We present a survey of transport methods for 3-D video, ranging from early analog 3DTV systems to the most recent digital technologies that show promise for designing the 3DTV systems of tomorrow. Potential digital transport architectures for 3DTV include the DVB architecture for broadcast and the Internet Protocol (IP) architecture for wired or wireless streaming. There are different multiview representation/compression methods for delivering the 3-D experience, which provide a tradeoff between compression efficiency, random access to views, and ease of rate adaptation; they include the "video-plus-depth" compressed representation and various multiview video coding (MVC) options. Commercial activities using these representations in broadcast and IP streaming have emerged, and successful transport of such data has been reported. Motivated by the growing impact of Internet-protocol-based media transport technologies, we focus on the ubiquitous Internet as the network infrastructure of choice for future 3DTV systems. Current research issues in unicast- and multicast-mode multiview video streaming include network protocols such as DCCP and peer-to-peer protocols, effective congestion control, packet loss protection and concealment, video rate adaptation, and network/service scalability. Examples of end-to-end systems for multiview video streaming are provided.
AKAR et al.: TRANSPORT METHODS IN 3DTV—A SURVEY
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 17, NO. 11, NOVEMBER 2007

A. Analog Transmission

The first known experimental 3DTV broadcast in the U.S. was on April 29, 1953, when a trial live broadcast of the series SPACE PATROL was run in Los Angeles at the National Association of Radio and Television Broadcasters 31st Annual gathering. The ABC affiliate station KECA-TV also aired the show, but viewers without a pair of special polarization lenses saw only a blurred mess [1]. The first "nonexperimental" 3DTV broadcast occurred about 30 years later. On December 19, 1980, a 3-D feature film, "Miss Sadie Thompson," and a 3-D short starring the Three Stooges were aired over SelecTV, a Los Angeles Pay-TV system. This broadcast was made possible by a new company, 3-D Video Corporation, which developed a working and practical system for presenting 3-D on conventional television sets using the anaglyph format [4]. This first broadcast on SelecTV was B&W, i.e., the color of the original film was removed prior to conversion with the 3DTV process. With the success of this first broadcast, SelecTV asked 3-D Video Corporation to convert another film, this time preferably in color. This was achieved on April 10, 1981, when SelecTV broadcast MGM's musical classic, "Kiss Me Kate." Following these early efforts, several other features and shorts were produced and broadcast worldwide in subsequent years.

In Europe, one of the earliest 3DTV activities was the experimental 3DTV broadcast programs transmitted in 1982 in several European countries [3]. These programs, which were aired in a simple red/green anaglyph format, were initiated by H.-J. Herbst of the Norddeutscher Rundfunk (NDR), Hamburg, Germany. In cooperation with Philips Research Laboratories, Eindhoven, The Netherlands, two popular-scientific 3-D series were produced. Together with some further 3-D productions by other European TV stations, these transmissions received an extremely favorable response. More than 40 million red/green viewing spectacles were sold. The expectation, however, that this could be "the TV of the future" was dashed by the poor visual quality attainable with the anaglyphic method, especially if transmitted via an unmodified standard TV system.

A second activity that gave further stimulation to 3DTV research and development efforts was the 3DTV demonstration in 1983 at the International Audio and Video Fair in Berlin [3]. Because of the high public interest in the earlier, anaglyphic 3DTV transmissions, the German broadcasters asked the Institut für Rundfunktechnik (IRT), Munich, Germany, to also showcase the high-quality 3DTV system developed at IRT at that time. This stereoscopic 3-D system was based on a standard phase alternating line (PAL) distribution chain operated in two-channel mode. For display, two video projectors with orthogonal polarization filters were used; in order to separate the views, the users had to wear matching polarization glasses. The presentations were so successful that they were continued at the Audio and Video Fairs of 1985 and 1987. Unfortunately, because the transmission system required a custom TV receiver, its use remained limited to large-venue demonstrations at exhibitions, symposia, conferences, and workshops.

In June of 1991, John Wayne's only 3-D film, Hondo, was broadcast on a network of 151 stations in the US. This was the closest event to a "network" 3-D broadcast to that date.

B. Digital Transmission

With the ongoing transition from analogue to digital TV services, additional hope for 3DTV arose in the early 1990s; especially in Europe, a number of European-funded projects (e.g., COST 230, RACE DISTIMA, RACE PANORAMA, ACTS MIRAGE, ACTS TAPESTRIES) were set up with the aim of developing standards, technologies, and production facilities for 3DTV [5]. Other groups, which focused on human-factors requirements for high-quality stereoscopic television, also joined the efforts [6]. Motivated by this revived interest in broadcast 3DTV, the Moving Picture Experts Group (MPEG) developed a new compression technology for stereoscopic video as part of the successful MPEG-2 standard [7]. The chosen approach, called the MPEG-2 multiview profile (MVP), can be regarded as an extension of the temporal scalability tool. It encodes the left-eye view as a base layer in conformance with the MPEG-2 main profile (MP), thus providing backwards compatibility with "conventional" 2-D digital TV receivers. The right-eye view is encoded as an enhancement layer using the scalable coding tools with additional prediction from the base layer. While MPEG-2 has become the underlying technology for digital standard-definition and high-definition 2DTV broadcasts worldwide, the MVP, unfortunately, has not found use in commercially available services.

Some promising attempts have been made to integrate stereoscopic TV and digital HDTV into a new, high-quality 3-D entertainment medium. Live broadcasting of stereoscopic HDTV was tried in Japan during the Nagano Winter Games in 1998 [8]. In cooperation with other organizations, TAO distributed right-eye and left-eye HDTV images of the events at a bit rate of 45 Mbps each via N-Star to NHK Broadcasting Center and Chiyoda Broadcasting Hall, both in Tokyo, where they were projected onto a large screen. These live 3-D images included those from hockey games and other events, impressing the audience with their powerful sense of reality.

A similar experiment was conducted in Korea/Japan during the FIFA World Cup 2002 [9]. Using a terrestrial and satellite network, a compressed stereoscopic 3-D HDTV signal was multicast to seven predetermined demonstration venues, which were approved by the host broadcast service (HBS), KirchMedia, and FIFA. More specifically, the right-eye and left-eye HDTV images were compressed in the so-called side-by-side format (horizontally decimated by a factor of two and rearranged into a single standard video field) using the MPEG-2 Main Profile@High Level codec at a bit rate of 40 Mbps and transmitted in an MPEG-2 Transport Stream (DVB-ASI) over a DS-3 network as specified in ITU-T Rec. G.703 [10]. At the receiving venues, the stereoscopic images recovered by the 3-D HDTV receiver were decoded, expanded horizontally, and displayed on large screens with polarized beam projectors.

Recently, research on 3DTV has moved its focus from the classical fixed-view stereoscopic television concept towards more flexible 3-D visual data representation formats. The Australian 3-D company DDD, for example, is marketing multiview autostereoscopic 3-D displays, where the required views (typically 8 or 9) are generated from the so-called "video-plus-depth" representation, which is a combination of monoscopic color video and associated per-pixel depth maps [11]. They propose a system that encodes the depth data in a proprietary, very low bit rate format, which is then transmitted in the "private" or "user data" of an MPEG-2 Transport Stream [12]. Required views are rendered at the receiver side using depth-image-based rendering (DIBR). A similar approach was followed by the European IST project ATTEST (Advanced Three-Dimensional Television Systems Technologies). Again, the "video-plus-depth" representation was used for transmitting 3-D visual information. In contrast to the DDD concept, standard MPEG technologies were used for the compression of the depth information as well [13], [14]. Within the project, it was shown that by using the newest MPEG video coding standard, H.264/AVC (Advanced Video Coding), it is possible to compress typical depth data to bit rates of around 200–300 kbps. Thus, compared to a conventional 2-D digital TV broadcast, the overhead required for the 3-D visual information is only around 10% (assuming a bit rate of around 3 Mbit/s for a typical 2DTV transmission over DVB-C/S/T).

Fig. 1. Block diagram of the 3DTV over DVB-T demo [53].

The first demonstration of a complete 3DTV system based on the ATTEST developments, whose block diagram is depicted in Fig. 1, was shown by Fraunhofer HHI at the International Broadcast Convention (IBC) in 2004 [15]. The distribution side of the demonstration consisted of a DTV-Recorder-Generator (DVRG) from Rohde & Schwarz, which was used for the real-time replay of an offline-generated MPEG-2 Transport Stream. By means of a connected DVB-T sender (SFQ), the 3DTV signal was aired at the booth. The transmitter power was adjusted such that reception of the signal could be ensured within an area of a few square meters. The MPEG-2 TS contained two 3-D programs, each comprising an MPEG-2 coded color video stream as well as an associated H.264/AVC coded depth-image sequence. The synchronization of the two bit streams was assured by conforming to the tools described in the MPEG-2 Systems specification and its respective amendments. The receiver side consisted of a conventional desktop PC with a PCI DVB-T card from TechnoTrend. The received MPEG-2 Transport Stream was demultiplexed in software, and the respective synchronized video bit streams were decoded in real time and forwarded to a 3-D renderer module which generated "virtual" stereoscopic views for display on the Fraunhofer HHI autostereoscopic Free2C 3-D display. This was the first demo of a 3DTV service based on the "video-plus-depth" 3-D data representation format using a real DVB-T transmission.

The "video-plus-depth" representation has been standardized within MPEG as a result of work initiated by Philips and Fraunhofer HHI. Only the representation has been standardized, by means of metadata which conveys the meaning of the gray-level values in the depth imagery, together with some additional metadata to signal the existence of an encoded depth stream [2]. The actual compression of the per-pixel depth information, on the other hand, has not been defined explicitly, so that every conventional MPEG video codec (e.g., MPEG-2 or H.264/AVC) can be used for this purpose. The new standard has been published in two parts: the specification of the depth format itself is called ISO/IEC 23002-3 (MPEG-C), and a method for transmitting "video-plus-depth" within a conventional MPEG-2 Transport Stream has become an amendment (Amd. 2) to ISO/IEC 13818-1 (MPEG-2 Systems). Both standards were finalized at the MPEG meeting in Marrakech, Morocco (January 2007). Technical details can be found in the standardization documents [54] and [55].

III. 3DTV OVER IP NETWORKS

The IP is proving to be very flexible in accommodating a wide range of communication services, as can be seen from the ongoing replacement of classical telephone services by voice over IP applications. Transmission of video over the Internet is currently an active research and development area, where significant results have already been achieved. There are already video-on-demand services, both for news and entertainment applications, offered over the Internet. Also, 2.5G and 3G mobile network operators have started to use IP successfully to offer wireless video services. In light of these advances, the transport of 3DTV signals over IP packet networks seems to be a natural choice. The IP itself leaves many aspects of the transmission to be defined by other layers of the protocol stack and thus offers flexibility in designing the optimal communications system for various 3-D data representations and encoding schemes. 3DTV streaming architectures (see Fig. 2) can be classified as: 1) server unicasting to a single client; 2) server multicasting to several clients; 3) peer-to-peer (P2P) unicasting, where each peer forwards packets to another peer; and 4) P2P multicasting, where each peer forwards packets to several other peers. Multiview video streaming protocols can be RTP/UDP/IP [16], which is the current state of the art, or RTP/DCCP/IP [18], which is the next-generation protocol. Multicasting can be supported at the network layer or the application layer. In the following, we first give an overview of the streaming protocols in Section III-A. Then, data representation, encoding, and rate adaptation for multiview video are discussed in Section III-B. Section III-C reviews methods for packet loss resilience. Examples of demonstration systems are reviewed in Section III-D.

Fig. 2. Block diagram of a 3-D streaming system.

A. Streaming Protocols

Today, the most widely used transport protocol for media/multimedia is the real-time transport protocol (RTP) over UDP [16]. However, RTP/UDP does not contain any congestion control mechanism and can therefore lead to congestion collapse when large volumes of multiview video are delivered. The datagram congestion control protocol (DCCP) [18] is designed as a replacement for UDP for media delivery, running directly over the Internet Protocol (IP) to provide congestion control without reliability. DCCP can be thought of as TCP minus reliability and in-order packet delivery, or as UDP plus congestion control, connection setup, and acknowledgements.

DCCP is a transport protocol that implements bidirectional unicast connections of congestion-controlled, unreliable datagrams. Despite the unreliable datagram flow, DCCP provides reliable handshakes for connection setup/teardown and reliable negotiation of options. Besides handshakes and feature negotiation, DCCP also accommodates a choice of modular congestion control mechanisms. Two congestion control schemes are currently defined in DCCP, one of which is to be selected at connection startup time: TCP-like Congestion Control [19] and TCP-Friendly Rate Control (TFRC) [20]. TCP-like Congestion Control, identified by Congestion Control Identifier 2 (CCID2) in DCCP, behaves similarly to TCP's Additive Increase Multiplicative Decrease (AIMD) congestion control, halving the congestion window in response to a packet drop. Applications using this congestion control mechanism will respond quickly to changes in available bandwidth, but must tolerate the abrupt changes in congestion window size typical of TCP. On the other hand, TFRC, which is identified by CCID3, is a form of equation-based flow control that minimizes abrupt changes in the sending rate while maintaining longer-term fairness with TCP. It is hence appropriate for applications that prefer a rather smooth sending rate, including streaming media applications with a small or moderate receiver buffer. In its operation, CCID3/TFRC calculates an allowed sending rate, called the TFRC rate, by using the TCP throughput equation, and this rate is provided to the sender application upon request. The sender may use this rate information to adjust its transmission rate. There is also an experimental RFC for TCP-Friendly Multicast Congestion Control (TFMCC).1 In order to compute the TFRC rate in a multicast scenario, each receiver computes its own TFRC rate as a function of its own measured RTT and loss rate, and sends this to the server. The server then selects the minimum of these rates. However, only a limited number of selected clients are allowed to send their TFRC rates to the server in order to prevent feedback explosion. In the case of DCCP, again each client measures its RTT and loss rate and sends them to the server, and the TCP-friendly rate is computed at the server based on the received feedback.

Hence, future 3DTV over IP services are expected to employ the DCCP protocol with effective video rate adaptation to match the TFRC rate. Multiview video source rate adaptation strategies are discussed in the following.

1[Online]. Available: http://www.ietf.org/rfc/rfc4654.txt?number=4654

B. Multiview Video Encoding and Rate Allocation/Adaptation

For streaming applications, multiview 3-D video can be represented and encoded either implicitly, in the so-called "video-plus-depth" representation, or explicitly in raw form. Representation and encoding of "video-plus-depth" data was briefly discussed in Section II-B. There are various approaches for representation and encoding of multiview raw video, which provide a trade-off between random access, ease of rate adaptation, and compression efficiency. These approaches include simulcast coding, scalable simulcast coding, multiview coding [21]–[26], and scalable multiview coding [27], [28]. A complete treatment of 3-D and free-viewpoint video representations and their compression is given in [29].

In streaming multiview video over the Internet, the video rate must be adapted to the available throughput and/or the TFRC rate in order to avoid congestion and to remain friendly to other TCP traffic. The rate adaptation of stereo and multiview video differs from that of monocular video, since rate allocation between views offers new flexibilities. According to the suppression theory of human visual perception of 3-D from stereoscopic video, if the right and left views are transmitted and displayed with unequal spatial, temporal, and/or quality resolutions, the overall 3-D video quality is determined by the view with the better resolution [56]. Therefore, rate adaptation of multiview video may be achieved at constant perceived 3-D video quality by adapting the spatial, temporal, and/or signal-to-noise (SNR) resolution of one of the views while encoding/transmitting the other view at full rate. Several open-loop and closed-loop rate adaptation strategies for stereo and multiview video at the server and client side have been studied for the UDP and DCCP protocols. In closed-loop rate adaptation, each client estimates some function of the received signal and feeds it back to the transmitter. The transmitter determines an optimized rate for the next transmission based on the received feedback. In open-loop rate adaptation, the transmitter does not use any feedback from the receiver.

In [30] and [31], open-loop rate adaptation strategies for stereo and multiview video at the server side for the UDP and DCCP protocols are studied. In [30], rate adaptation is achieved by downscaling one of the views using: 1) spatial subsampling; 2) temporal subsampling; 3) scaling of the quantization step-size; and 4) content-adaptive scaling. In content-adaptive video scaling, the right video is first divided into temporal segments (shots or subshots) using well-known temporal segmentation methods. The temporal segments (shots) are then classified into four categories as determined by their low-level attributes, such as the amount of motion and spatial detail within the segment.
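The per-shot decisions of this content-adaptive scaling can be sketched as follows; the thresholds, category encoding, and function names are illustrative simplifications, not the actual classifier of [30]:

```python
def classify_shot(motion, spatial_detail, t_motion=0.5, t_spatial=0.5):
    """Place a temporal segment into one of four categories from its
    low-level attributes (amount of motion, spatial detail).
    Thresholds are illustrative placeholders."""
    return ("high" if motion >= t_motion else "low",
            "high" if spatial_detail >= t_spatial else "low")

def downscale_decision(motion_class, spatial_class):
    """Decide which resolution of the scaled view may be reduced:
    temporal subsampling only for low-motion shots, spatial
    subsampling only for shots with low spatial detail."""
    return {"reduce_temporal": motion_class == "low",
            "reduce_spatial": spatial_class == "low"}
```

For example, a high-motion, low-detail shot would keep full temporal resolution but could be spatially downsampled, which matches the rules discussed for this scheme.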
Shots with high temporal activity (high motion) need to be encoded at full temporal resolution for a smooth viewing experience. On the other hand, if a somewhat stationary shot is being encoded, the temporal sampling rate can be reduced to a lower value without any loss of perceptual quality. Likewise, shots with high spatial detail should not be reduced to lower spatial resolutions, while downsampling can be applied to shots with low spatial detail. Experimental results show that the content-adaptive approach for temporal and spatial downsampling of one of the views yields better compression with higher perceptual quality.

In [31], the video is encoded offline with a predetermined number of spatial, temporal, and SNR scalability layers. Content-aware bit allocation among the views is performed during bit stream extraction by adaptive selection of the number of spatial, temporal, and SNR scalability layers for each group of pictures (GOP) according to the motion and the spatial activity of that GOP. If the GOP has low spatial detail, as indicated by the spatial measure, only the base SNR layer is extracted. For high spatial detail, both the base SNR and fine granular scalable (FGS) layers are extracted. Similarly, for a low-motion GOP, only quarter temporal resolution is extracted, whereas for a high-motion GOP half temporal resolution is extracted. The required bit rate reduction is only applied to one of the views. In the experiments, the sequences are encoded offline with three temporal layers per view and a single FGS layer on top of the base quality layer. The results show that adaptive selection of temporal levels and quality layers provides better rate-distortion performance compared to static cases.

In [32] and [33], closed-loop strategies have been proposed where rate adaptation is done at the server side using feedback from the user. In [32], a client-driven multiview video streaming system is simulated that allows a user to watch 3-D video interactively with significantly reduced bandwidth requirements by transmitting a small number of views selected according to his/her head position. The user's head position is tracked and predicted into the future to select the views that best match the user's current viewing angle dynamically. Prediction of future head positions is needed so that views matching the predicted head positions can be requested from the server ahead of time in order to account for delays due to network transport and stream switching. The system allocates more bandwidth to the selected views in order to render the current viewing angle. The proposed system makes use of multiview coding (MVC) and scalable video coding (SVC) concepts together to obtain improved compression efficiency while providing flexibility in bandwidth allocation to the selected views. The rate-distortion performance of the system has been demonstrated under different conditions. In [33], this idea is extended to a multicast scenario where each view is streamed to a different IP-multicast address. A viewer's client joins appropriate multicast groups to receive only the 3-D information relevant to its current viewpoint. The set of selected videos changes in real time as the user's viewpoint changes. The performance of the approach has been studied through network experiments.

The transmission of another promising new 3-D data representation format [35] is assessed by Chang and Girod [34], who developed a rate-distortion optimized scheme for interactive lightfield streaming. In their approach, the lightfield data set is transformed into blocks of wavelet coefficients; each block is then coded as a scalable bit stream and stored at the sender. To render a frame, the receiver issues a request for relevant data. Based on the request, the estimated state of the data already at the receiver, the network characteristics, and the desired transmission rate, the sender performs rate-distortion optimized bit allocation as a convex optimization process, customizing the outgoing packets to minimize the distortion of the frame rendered at the receiver for the given transmission rate. Experimental results with a statistical network model show that the proposed rate-distortion optimized scheme reduces the required bit rate by 10%–25% over a heuristic scheme at the same render quality.

C. Error Correction and Concealment

Streaming media applications often suffer from packet losses on wired or wireless IP links. Congestion is the main cause of packet losses over the wired Internet. In contrast to the wired backbone, the capacity of the wireless channel is fundamentally limited by the available bandwidth of the radio spectrum and various types of noise and interference, which lead to bit errors. Most network protocols discard packets with bit errors, thus translating bit errors into packet losses. Therefore, the wireless channel is the "weakest link" of future multimedia networks and hence requires special attention, especially when mobility gives rise to fading and error bursts. In particular, joint source and channel coding techniques have been developed for the efficient transmission of video streams over packet erasure channels, both in wired and wireless networks [50], [51]. Furthermore, error concealment methods at the decoder must be considered in order to limit the damage, especially due to temporal error propagation, resulting from unpreventable packet losses.

Common error correction approaches for reliable transmission of monoscopic video over packet networks include retransmission requests (ARQ) as in [36] or forward error correction (FEC) as in [37]–[39]. ARQ methods, which require feedback (ACK) messages that inform the sender about the reliable reception of the data, may be effective in dealing with packet losses if sufficient playout (preroll) delay is allowed at the client. It may be more desirable to employ time-limited ARQ at the application layer over the UDP or DCCP protocol, which allows ARQ only within a limited period (less than the preroll delay at the client), as opposed to unlimited ARQ at the network layer (as in the TCP protocol). In cases where a feedback channel cannot be used extensively, such as in broadcast and multicast services, channel coding techniques have been widely applied to combat transmission errors. Advanced channel coding techniques for transmission of 3-D data have been reported in [40]–[42].

In [41], the transmission of multiview video encoded streams over packet erasure networks is examined. Macroblock classification into unequally important slice groups is achieved using the flexible macroblock ordering (FMO) tool of H.264/AVC. Systematic LT codes [52] are used for error protection due to their low complexity and advanced performance. The optimal slice grouping and channel rate allocation is determined by an iterative optimization algorithm based on dynamic programming. The optimization procedure starts by determining the channel protection of each frame. Then macroblocks are classified into slice groups.
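The unequal-protection idea behind such slice grouping can be illustrated with a toy parity-budget split; the proportional rule, weights, and function name are illustrative and not the optimized allocation of [41]:

```python
def allocate_parity(groups, parity_budget):
    """Split a total parity-symbol budget across slice groups in
    proportion to an importance weight. `groups` is a list of
    (name, weight) pairs; more important groups get more parity,
    so they survive heavier packet loss."""
    total_weight = sum(weight for _, weight in groups)
    alloc = {name: parity_budget * weight // total_weight
             for name, weight in groups}
    # give any rounding remainder to the most important group
    remainder = parity_budget - sum(alloc.values())
    if remainder:
        most_important = max(groups, key=lambda g: g[1])[0]
        alloc[most_important] += remainder
    return alloc
```

An optimized scheme replaces the fixed proportional rule with a search (as in the dynamic-programming procedure described here) over allocations that minimizes expected distortion.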
6. AKAR et al.: TRANSPORT METHODS IN 3DTV—A SURVEY 1627
Fig. 3. A block diagram of a end-to end stereoscopic video streaming test-bed.
slice groups and optimal channel protection for this classification In [43], an error concealment algorithm that fully makes use
is found. The next step is to calculate the expected distortion of the characteristics of a stereoscopic video sequence based
of allowable neighboring macroblock classifications with the on a relativity analysis is proposed. Based on the relativity of
restriction that a single packet can be exchanged between succes- prediction modes for right frames, the prediction mode of each
sive groups. The last step includes comparison of the distortion macroblock in the lost frame is chosen, and finally utilized to
of the ancestor classification and lowest average distortion of all restore the lost macroblock according to the estimated motion
descendant classifications. Based on this comparison either the vector or disparity vector. Experimental results show that the
previous steps are repeated or the algorithm is terminated. Even proposed algorithm can restore the lost frame with good quality,
though it has been shown that the proposed algorithm performs and that is efficient for the error concealment of entire lost right
significantly better than multiview coding schemes using one frames in a stereoscopic video sequence.
slice group, no results for the computational complexity is given. To estimate the capabilities of error concealment for stereo-
Stereoscopic video streaming using FEC techniques are scopic image data, a strategy for concealment of block bursts
examined in [42]. Frames are classified according to their in independently coded images was studied in [44] and [45]
contribution to the overall quality, and then used as layers of assuming block based video coding and loss of consecutive
the video. Since losing I-frame causes large distortions due to blocks. To increase the quality of the reconstructed block,
motion/disparity compensation and error propagation, I-frames additional information from the corresponding view is utilized.
should be protected the most. Among P-frames, left frames are As in a stereoscopic sequence, samples of both views corre-
more important since they can be encoded without the help spond to each other through the 3-D geometry of the scene and
of right frames. According to this prioritization of the frames, camera properties, the two views are highly correlated. Due to
three layers are formed. These three layers of stereoscopic this high correlation between the views, information about the
video are used for unequal error protection (UEP). A comparative analysis of Reed–Solomon (RS) and systematic Luby transform (LT) codes is provided via simulations to determine the optimum packetization and UEP strategies.
When dealing with low-bit-rate video, packet losses may lead to the loss of an entire frame. Several studies exist in the literature on frame loss concealment algorithms for monoscopic video, but these methods may not be directly applicable to stereoscopic video [57]–[62]. The reconstruction of lost information is principally based on a priori knowledge about the characteristics of the error and the lost data. Many strategies are based on interpolating the surrounding image data into the lost region. While such interpolation techniques yield satisfactory results in a monoscopic scenario, they are not sufficient in a stereoscopic scenario because the information on depth is not preserved. Human perception of errors in 3-D video data also differs from the 2-D case: in [44], it has been shown that even a small degradation in one of the views results in a significant perceptual distortion.
The hybrid concealment approach of [45] exploits the fact that, in stereoscopic video, the corresponding region in the other view is highly useful for the reconstruction of the lost block. First, corresponding pixel pairs (matches) around the erroneous region are identified using feature matching and the principles of epipolar geometry. To reduce the negative effect of outliers, i.e., badly localized matches, robust estimation of the transformation parameters is used [45].
D. 3D Video Streaming Demonstrations
Several end-to-end 3-D video streaming systems have been reported in the literature. In [46], a 3DTV prototype system with real-time acquisition, transmission, and autostereoscopic display of dynamic scenes was presented by MERL. This system is composed of a multiprojector 3-D display, an array of cameras, and network-connected PCs. Multiple video streams are individually encoded and sent over a broadband network to the display. The 3-D display shows high-resolution stereoscopic color images for multiple viewpoints without special glasses. This system uses light-field rendering to synthesize views at the correct virtual camera positions.
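The robust-estimation step used by stereoscopic concealment methods such as [45] can be illustrated with a minimal sketch. The snippet below is not the algorithm of [45]: it assumes rectified views, so a single horizontal disparity stands in for the full transformation, and a RANSAC-style sampling loop suppresses badly localized matches. All function and parameter names are illustrative.

```python
import random

def estimate_disparity_ransac(matches, iters=200, tol=1.5):
    """Robustly estimate a horizontal disparity from noisy point matches.

    matches: list of ((xl, y), (xr, y)) pixel pairs found around the
    erroneous region. A purely translational model stands in for the full
    transformation; random hypothesis sampling limits the influence of
    outliers (badly localized matches)."""
    best_inliers = []
    for _ in range(iters):
        (xl, _), (xr, _) = random.choice(matches)   # hypothesis from one match
        d = xl - xr
        inliers = [m for m in matches
                   if abs((m[0][0] - m[1][0]) - d) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refine the estimate as the average disparity over the inlier set.
    refined = sum(m[0][0] - m[1][0] for m in best_inliers) / len(best_inliers)
    return refined, best_inliers

def conceal_block(other_view, x, y, w, h, disparity):
    """Fill a lost w-by-h block at (x, y) by copying the disparity-shifted
    region from the other, correctly received view."""
    dx = int(round(disparity))
    return [row[x + dx : x + dx + w] for row in other_view[y : y + h]]
```

In a real concealment chain the matches would come from a feature detector, and the estimated transformation would warp the corresponding region of the other view onto the lost block, as described above.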
Fig. 4. Stereoscopic projection display system.
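Prototype systems of this kind typically announce their streams to clients with the session description protocol (SDP). A hypothetical SDP description of a two-view RTP/UDP session is sketched below; the addresses, ports, payload types, and labels are illustrative, and announcing one RTP stream per view is only one plausible way such a prototype could advertise the left and right views.

```
v=0
o=server 2890844526 2890842807 IN IP4 192.0.2.10
s=Stereoscopic video streaming demo
c=IN IP4 192.0.2.10
t=0 0
m=video 5004 RTP/AVP 96
a=rtpmap:96 H264/90000
a=label:left-view
m=video 5006 RTP/AVP 97
a=rtpmap:97 H264/90000
a=label:right-view
```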
In [47], a streaming solution based on depth-image-based rendering is proposed, together with an efficient content delivery architecture based on resource sharing in groups of collaborating network hosts.
Recently, an end-to-end prototype system for point-to-point streaming of stereoscopic video over UDP was demonstrated at the IST 2006 event [48]. A block diagram of the prototype system is shown in Fig. 3. Multiple clients have been developed by modifying the VideoLAN client for different 3-D displays. The prototype system operates over a LAN with no packet losses. The server employs the RTP/UDP/IP protocol stack and can serve multiple clients simultaneously. The session description protocol (SDP) is used to ensure interoperability with the clients. Three clients have been implemented for different types of display systems: Client-1 supports the autostereoscopic Sharp 3-D laptop, Client-2 supports a monocular display to demonstrate backwards compatibility, and Client-3 supports an in-house polarized 3-D projection display system that uses a pair of Sharp MB-70X projectors, as shown in Fig. 4.
Another demonstration built on the concept of transmitting “video-plus-depth” (disparity) information over IP was the VIRTUE 3-D videoconferencing system [49]. Its innovative approach combined 3-D video and virtual reality techniques in a mixed reality application, providing immersive telepresence as well as a natural conferee representation in a shared collaboration space. The generated 3-D data was encoded using MPEG-4 technologies and streamed over a packet-switched network using RTP/UDP with the payload formats defined for MPEG-4 audio/visual streams.
IV. DISCUSSION AND CONCLUSION
A comprehensive survey of the state of the art in transport techniques that are potentially applicable to 3DTV transmission has been presented, with particular emphasis on packet networks using IP, which plays an integrating role for all media transport. It is understood that 3DTV transmission may create the largest resource demands faced by the network infrastructure up to now. While the transport solutions must address backwards-compatibility issues with the existing digital TV standards and infrastructure, and hence can only support a limited set of 3-D data representations and rendering technologies, the streaming-over-IP solutions offer more flexibility to support different 3-D displays, 3-D data representations, and compression options. As a result, while we reviewed a specific 3-D data representation, the “video-plus-depth” representation, and image-based rendering technology for digital 3DTV broadcast, we aimed to review the general state of the art in streaming protocols, multiview video compression standards, rate adaptation, packet loss protection methods, and research directions for streaming 3DTV over IP.
The current and future research issues for 3DTV transmission can be summarized as: 1) the joint choice of transport and coding, because the gains obtained in one of them can easily be nullified by the other; 2) determination of the best rate adaptation method, where adaptation refers both to adapting the rate of each view and the inter-view rate allocation depending on the available network rate and video content, and to adapting the number and quality of the transmitted views depending on the available network rate, the user's display technology, and the desired viewpoint; and 3) error-resilient video encoding and streaming strategies utilizing the 3-D structure.
ACKNOWLEDGMENT
The authors would like to thank C. Bilen and A. Aksay from the Middle East Technical University, S. Pehlivan, E. Kurutepe, B. Gorkemli, and G. Gurler from Koç University, and N. Ozbek from Ege University for their help and support.
REFERENCES
[1] R. M. Hayes, 3D Movies: A History and Filmography of Stereoscopic Cinema. New York: McFarland, 1998.
[2] C. Fehn, “3D-TV broadcasting,” in 3D Video Communication, O. Schreer, P. Kauff, and T. Sikora, Eds. New York: Wiley, 2005.
[3] R. Sand, “3D-TV—A review of recent and current developments,” in Proc. IEE Colloq. Stereoscopic Television, London, U.K., Oct. 1992, pp. 1–4.
[4] Dimension 3D, Natural Vision (Anaglyph). 2007 [Online]. Available: http://www.3dcompany.com/nvhist.html
[5] W. A. IJsselsteijn, P. J. H. Seuntiëns, and L. M. J. Meesters, “State-of-the-art in human factors and quality issues of stereoscopic broadcast television,” IST-2001-34396 (ATTEST), Tech. Rep. D1, Aug. 2002.
[6] S. Pastoor, “3D-television: A survey of recent research results on subjective requirements,” Signal Process.: Image Commun., vol. 4, no. 1, pp. 21–32, Nov. 1991.
[7] H. Imaizumi and A. Luthra, “Stereoscopic video compression standard—MPEG-2 multiview profile,” in Three-Dimensional Television, Video, and Display Technologies, B. Javidi and F. Okano, Eds. New York: Springer-Verlag, 2002, pp. 169–181.
[8] I. Yuyama and M. Okui, “Stereoscopic HDTV,” in Three-Dimensional Television, Video, and Display Technologies, B. Javidi and F. Okano, Eds. New York: Springer-Verlag, 2002, pp. 3–34.
[9] N. Hur, C.-H. Ahn, and C. Ahn, “Experimental service of 3DTV broadcasting relay in Korea,” in Proc. SPIE Three-Dimensional TV, Video, and Display, Boston, MA, Aug. 2002, pp. 1–13.
[10] Physical/Electrical Characteristics of Hierarchical Digital Interfaces, ITU-T Rec. G.703, 2001.
[11] P. Harman, “An architecture for digital 3-D broadcasting,” in Proc. SPIE Stereoscopic Displays Virtual Reality Syst. VI, San Jose, CA, Jan. 1999, pp. 254–259.
[12] J. Flack, P. Harman, and S. Fox, “Low bandwidth stereoscopic image encoding and transmission,” in Proc. SPIE Stereoscop. Displ. Virtual Reality Syst. X, Santa Clara, CA, May 2003, pp. 206–215.
[13] C. Fehn, “Depth-image-based rendering (DIBR), compression and transmission for a new approach on 3DTV,” in Proc. SPIE Stereoscop. Displ. Virtual Reality Syst. XI, San Jose, CA, Jan. 2004, pp. 93–104.
[14] C. Fehn, “A 3DTV system based on video plus depth information,” presented at the 37th Asilomar Conf. Signals, Syst., Comput., Pacific Grove, CA, Nov. 2003.
[15] International Broadcasting Convention (IBC 2004), 2004 [Online]. Available: http://ip.hhi.de/ibc2004.htm
[16] Y. Kikuchi, T. Nomura, S. Fukunaga, Y. Matsui, and H. Kimata, “RTP payload format for MPEG-4 audio/visual streams,” Internet Engineering Task Force, RFC 3016, Nov. 2000.
[17] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, “RTP: A transport protocol for real-time applications,” Internet Engineering Task Force, RFC 3550, Jul. 2003.
[18] E. Kohler, M. Handley, and S. Floyd, “Datagram congestion control protocol (DCCP),” Internet Engineering Task Force, RFC 4340, Mar. 2006.
[19] S. Floyd and E. Kohler, “Profile for datagram congestion control protocol (DCCP) congestion control ID 2: TCP-like congestion control,” Internet Engineering Task Force, RFC 4341, Mar. 2006.
[20] S. Floyd, E. Kohler, and J. Padhye, “Profile for DCCP congestion control ID 3: TCP-friendly rate control (TFRC),” Internet Engineering Task Force, RFC 4342, Mar. 2006.
[21] A. Smolic, P. Merkle, K. Müller, C. Fehn, P. Kauff, and T. Wiegand, “Compression of multiview video and associated data,” in Three-Dimensional Television: Capture, Transmission, and Display, H. Ozaktas and L. Onural, Eds. New York: Springer-Verlag, 2007.
[22] Advanced Video Coding for Generic Audiovisual Services, ITU-T Rec. H.264, Mar. 2005 [Online]. Available: http://www.itu.int/rec/T-REC-H.264
[23] JM Reference Software, ver. JM 11.0, Aug. 2006 [Online]. Available: http://iphome.hhi.de/suehring/tml
[24] Scalable Video Coding—Working Draft 1, Joint Video Team of ITU-T VCEG and ISO/IEC MPEG, Doc. JVT-N020, Jan. 2005.
[25] Joint Scalable Video Model JSVM-4, Joint Video Team of ITU-T VCEG and ISO/IEC MPEG, Doc. JVT-Q202, Oct. 2005.
[26] Joint Multiview Video Model (JMVM) 1.0, Joint Video Team of ITU-T VCEG and ISO/IEC MPEG, Doc. JVT-T208, Jul. 2006.
[27] N. Ozbek and A. M. Tekalp, “Scalable multiview video coding for interactive 3DTV,” in Proc. IEEE Int. Conf. Multimedia Expo (ICME), Toronto, ON, Canada, Jul. 2006, pp. 213–216.
[28] M. Drose, C. Clemens, and T. Sikora, “Extending single-view scalable video coding to multiview based on H.264/AVC,” in Proc. IEEE Int. Conf. Image Process. (ICIP), Atlanta, GA, Oct. 2006, pp. 2977–2980.
[29] A. Smolic, K. Müller, P. Merkle, C. Fehn, P. Kauff, P. Eisert, and T. Wiegand, “3D video and free viewpoint video—technologies, applications and MPEG standards,” in Proc. IEEE Int. Conf. Multimedia Expo (ICME), Toronto, ON, Canada, Jul. 2006, pp. 2161–2164.
[30] A. Aksay, C. Bilen, E. Kurutepe, T. Ozcelebi, G. Bozdagi-Akar, M. R. Civanlar, and A. M. Tekalp, “Temporal and spatial scaling for stereoscopic video compression,” presented at the 14th Eur. Signal Process. Conf. (EURASIP), Florence, Italy, Sep. 2006.
[31] N. Ozbek and A. M. Tekalp, “Content-aware bit allocation in scalable multiview video coding,” in Proc. Int. Workshop Multimedia Content Representat., Classificat. Security (MCRS), LNCS 4105, 2006, pp. 691–698.
[32] E. Kurutepe, M. R. Civanlar, and A. M. Tekalp, “A receiver-driven multicasting framework for 3DTV transmission,” presented at the 13th Eur. Signal Process. Conf. (EURASIP), Antalya, Turkey, Sep. 2005.
[33] E. Kurutepe, M. R. Civanlar, and A. M. Tekalp, “Interactive transport of multiview videos for 3DTV applications,” J. Zhejiang Univ. Science A, vol. 7, no. 5, pp. 830–836, 2006.
[34] C. Chang and B. Girod, “Rate-distortion optimized interactive streaming for scalable bit streams of light fields,” in Proc. SPIE Vis. Commun. Image Process. (VCIP), San Jose, CA, Jan. 2004, pp. 222–233.
[35] M. Levoy and P. Hanrahan, “Light field rendering,” in Proc. ACM SIGGRAPH, New Orleans, LA, Aug. 1996, pp. 31–42.
[36] G. J. Conklin, G. S. Greenbaum, K. O. Lillevold, A. F. Lippman, and Y. A. Reznik, “Video coding for streaming media delivery on the Internet,” IEEE Trans. Circuits Syst. Video Technol., vol. 11, no. 3, pp. 269–281, Mar. 2001.
[37] M. Link, B. Girod, K. Stuhlmüller, and U. Horn, “Packet loss resilient internet video streaming,” in Proc. SPIE Visual Commun. Image Process. (VCIP), San Jose, CA, Jan. 1999, pp. 833–844.
[38] H. Cai, B. Zeng, G. Shen, Z. Xiong, and S. Li, “Error-resilient unequal error protection of fine granularity scalable video bit streams,” EURASIP J. Appl. Signal Process. (Special Issue on Advanced Video Technologies and Applications for H.264/AVC and Beyond), vol. 2006, Article ID 45412, pp. 1–11, 2006.
[39] J. W. Pei and Y. Modestino, “H.263+ packet video over wireless IP networks using rate-compatible punctured turbo (RCPT) codes with joint source-channel coding,” in Proc. IEEE Int. Conf. Image Process. (ICIP), Rochester, NY, Sep. 2002, pp. 541–544.
[40] P. Y. Malcolm, J. A. Fernando, W. A. C. Loo, K. K. Arachchi, and H. K. Yip, “Joint source and channel coding for H.264 compliant stereoscopic video transmission,” in Proc. Can. Conf. Electr. Comput. Eng., Saskatoon, SK, Canada, May 2005, pp. 188–191.
[41] S. Argyropoulos, A. S. Tan, N. Thomos, E. Arikan, and M. G. Strintzis, “Robust transmission of multiview video streams using flexible macroblock ordering and systematic LT codes,” presented at the 3DTV-CON, Kos Island, Greece, May 2007.
[42] A. S. Tan, A. Aksay, C. Bilen, G. B. Akar, and E. Arikan, “Error resilient layered stereoscopic video streaming,” presented at the 3DTV-CON, Kos Island, Greece, May 2007.
[43] L. Pang, M. Yu, G. Jiang, Z. Jiang, and F. Li, “An approach to error concealment for entire right frame loss in stereoscopic video transmission,” in Proc. Int. Conf. Comput. Intell. Security, Guangzhou, China, Nov. 2006, pp. 1665–1670.
[44] S. Knorr, C. Clemens, M. Kunter, and T. Sikora, “Robust concealment for erroneous block bursts in stereoscopic images,” in Proc. 2nd Int. Symp. 3-D Data Process., Visual., Transmiss. (3DPVT), Thessaloniki, Greece, Sep. 2004, pp. 820–827.
[45] C. Clemens, M. Kunter, S. Knorr, and T. Sikora, “A hybrid approach for error concealment in stereoscopic images,” presented at the 5th Int. Workshop Image Anal. Multimedia Interactive Services (WIAMIS), Lisboa, Portugal, Apr. 2004.
[46] A. Vetro, W. Matusik, H. Pfister, and J. Xin, “Coding approaches for end-to-end 3-D TV systems,” presented at the Picture Coding Symp. (PCS), San Francisco, CA, Dec. 2004.
[47] G. Petrovic and P. H. N. de With, “Framework for layered 3-D video streaming,” in Proc. 27th Symp. Inf. Theory Benelux, Noordwijk, The Netherlands, Jun. 2006, pp. 53–60.
[48] S. Pehlivan, A. Aksay, C. Bilen, G. B. Akar, and R. Civanlar, “End-to-end stereoscopic video streaming system,” in Proc. IEEE Int. Conf. Multimedia Expo (ICME), Toronto, ON, Canada, Jul. 2006, pp. 2169–2172.
[49] P. Kauff, O. Schreer, and R. Tanger, “A mixed reality approach for immersive tele-collaboration,” in Proc. Int. Workshop Immersive Telepresence, Juan Les Pins, France, Dec. 2002, pp. 1–4.
[50] Y. Wang, S. Wenger, J. Wen, and A. Katsaggelos, “Error resilient video coding techniques,” IEEE Signal Process. Mag., vol. 17, no. 4, pp. 61–82, Jul. 2000.
[51] Z. Tan and A. Zakhor, “Error control for video multicast using hierarchical FEC,” in Proc. Int. Conf. Image Process. (ICIP), Kobe, Japan, Oct. 1999, pp. 401–405.
[52] M. Luby, “LT codes,” in Proc. 43rd Annu. IEEE Symp. Foundations Comput. Sci. (FOCS), Nov. 2002, pp. 271–282.
[53] C. Fehn, “Depth-image-based rendering (DIBR), compression, and transmission for a flexible approach on 3DTV,” Ph.D. dissertation, Technical Univ. Berlin, Berlin, Germany, 2006.
[54] J. van der Meer and A. Bourge, Carriage of Auxiliary Video Data, ISO/IEC JTC 1/SC 29/WG 11, FPDAM of ISO/IEC 13818-1:200X/AMD 2, WG 11 Doc. N8094, Jul. 2006.
[55] A. Bourge and C. Fehn, Representation of Auxiliary Video and Supplemental Information, ISO/IEC JTC 1/SC 29/WG 11, Study of ISO/IEC FCD 23002-3, WG 11 Doc. N8482, Oct. 2006.
[56] L. B. Stelmach, W. J. Tam, D. Meegan, and A. Vincent, “Stereo image quality: Effects of mixed spatio-temporal resolution,” IEEE Trans. Circuits Syst. Video Technol., vol. 10, no. 2, pp. 188–193, Mar. 2000.
[57] Y. Xu and Y. Zhou, “H.264 video communication based refined error concealment schemes,” IEEE Trans. Consum. Electron., vol. 50, no. 2, pp. 1135–1141, Nov. 2004.
[58] D. Agrafiotis, D. R. Bull, and C. N. Canagarajah, “Enhanced error concealment with mode selection,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 8, pp. 960–973, Aug. 2006.
[59] S. Belfiore, M. Grangetto, E. Magli, and G. Olmo, “An error concealment algorithm for streaming video,” in Proc. ICIP, 2003, pp. 649–652.
[60] P. Baccichet and A. Chimienti, “A low complexity concealment algorithm for the whole-frame loss in H.264/AVC,” presented at the MMSP, 2004.
[61] Q. Peng, T. W. Yang, and C. Q. Zhu, “Block-based temporal error concealment for video packet using motion vector extrapolation,” in Proc. IEEE Commun., Circuits Syst. West Sino Expo., 2002, pp. 10–14.
[62] Y. Chen, K. Yu, J. Li, and S. Li, “An error concealment algorithm for entire frame loss in video transmission,” presented at the Picture Coding Symp., San Francisco, CA, Dec. 2004.

Gozde B. Akar (S’86–M’94–SM’98) received the B.S. degree from Middle East Technical University, Ankara, Turkey, in 1988 and the M.S. and Ph.D. degrees from Bilkent University, Bilkent, Turkey, in 1990 and 1994, respectively, all in electrical and electronics engineering.
She was with the Center of Electronic Imaging Systems, University of Rochester, Rochester, NY, as a Visiting Research Associate from 1994 to 1996. From 1996 to 1998, she worked as a Member of Research and Technical Staff at the Xerox Corporation Digital Imaging Technology Center, Rochester. From 1998 to 1999, she was with the Department of Electrical and Electronics Engineering, Baskent University. During the summer of 1999, she worked as a Visiting Researcher at the Multimedia Laboratories, New Jersey Institute of Technology. Currently, she is an Associate Professor with the Department of Electrical and Electronics Engineering, Middle East Technical University, Ankara, Turkey. Her research interests are in face recognition, 2-D and 3-D video compression, and multimedia streaming.
Dr. Akar is an Editor of the EURASIP Journal on Signal Processing: Image Communication.

A. Murat Tekalp (S’80–M’84–SM’91–F’03) received the M.S. and Ph.D. degrees in electrical, computer, and systems engineering from Rensselaer Polytechnic Institute (RPI), Troy, NY, in 1982 and 1984, respectively.
He was with Eastman Kodak Company, Rochester, NY, from December 1984 to June 1987, and with the University of Rochester, Rochester, NY, from July 1987 to June 2005, where he was promoted to Distinguished University Professor. Since June 2001, he has been a Professor at Koç University, Istanbul, Turkey. His research interests are in the area of digital image and video processing, including video compression and streaming, motion-compensated video filtering for high resolution, video segmentation, content-based video analysis and summarization, 3DTV/video processing and compression, multicamera surveillance video processing, and protection of digital content. He authored the book Digital Video Processing (Prentice-Hall, 1995) and holds seven U.S. patents. His group contributed technology to the ISO/IEC MPEG-4 and MPEG-7 standards.
Dr. Tekalp was named Distinguished Lecturer by the IEEE Signal Processing Society in 1998, awarded a Fulbright Senior Scholarship in 1999, and received the TUBITAK Science Award (highest scientific award in Turkey) in 2004. He chaired the IEEE Signal Processing Society Technical Committee on Image and Multidimensional Signal Processing (January 1996–December 1997). He served as an Associate Editor for the IEEE TRANSACTIONS ON SIGNAL PROCESSING (1990–1992) and the IEEE TRANSACTIONS ON IMAGE PROCESSING (1994–1996), and Multidimensional Systems and Signal Processing (1994–2002). He was an Area Editor for Graphical Models and Image Processing (1995–1998). He was also on the Editorial Board of Visual Communication and Image Representation (1995–2002). He was appointed as the Special Sessions Chair for the 1995 IEEE International Conference on Image Processing, the Technical Program Co-Chair for IEEE ICASSP 2000 in Istanbul, the General Chair of the IEEE International Conference on Image Processing (ICIP) in Rochester in 2002, and the Technical Program Co-Chair of EUSIPCO 2005 in Antalya, Turkey. He is the Founder and First Chairman of the Rochester Chapter of the IEEE Signal Processing Society. He was elected as the Chair of the Rochester Section of IEEE for 1994 to 1995. At present, he is the Editor-in-Chief of EURASIP Signal Processing: Image Communication. He is serving as the Chairman of the Electronics and Informatics Group of the Turkish Science and Technology Foundation (TUBITAK) and as an independent expert reviewing projects for the European Commission.

Christoph Fehn (M’99) received the Dipl.-Ing. degree from the University of Dortmund, Dortmund, Germany, in 1998, and the Ph.D. degree from the Technical University Berlin, Berlin, Germany, in 2006, for his work on a novel approach for 3DTV.
After receiving his Dipl.-Ing. degree, he joined the Fraunhofer Institute for Telecommunications, Heinrich-Hertz-Institut (HHI), Berlin, Germany, where he is currently working as a scientific project manager and coordinator for projects in the area of immersive media, three-dimensional television (3DTV), and digital cinema. He has been an active contributor to the ISO MPEG group, where he is acting as an Associate Editor of the new MPEG-C Part 3 standard, which is concerned with the definition of encoding and transport mechanisms for depth/parallax data or other auxiliary video data representations.
Dr. Fehn is a member of VDE and SMPTE.

M. Reha Civanlar (F’05) received the B.S. and M.S. degrees in electrical engineering from Middle East Technical University (METU), Ankara, Turkey, and the Ph.D. degree in electrical and computer engineering from North Carolina State University (NCSU), Raleigh, in 1984.
He is a Vice President and Director of the Media Laboratory at DoCoMo USA Labs, Palo Alto, CA. He was a Visiting Professor of Computer Engineering at Koc University, Istanbul, Turkey, from 2002 to 2006. He was also leading a multinational European research project on 3-D TV transport and participating in numerous Turkish industrial boards. He is serving on the advisory boards of Argela Technologies Inc., on 3G multimedia systems, and Layered Media Inc., on multipoint videoconferencing. Before these, he was the Head of the Visual Communications Research Department at AT&T Labs-Research starting from 1991; in the same department, he also held Technology Consultant and Technology Leader positions before heading the group. Prior to that, he was with the Pixel Machines Department, Bell Laboratories, where he worked on parallel architectures and algorithms for image and volume processing and scientific visualization. His career started as a researcher in the Center for Communications and Signal Processing of NCSU, where he worked on image processing. He has numerous publications, several key contributions to international multimedia communications standards, and over 40 patents either granted or pending. His current research interests include packet video systems, networked video and multimedia applications with particular emphasis on the Internet and wireless systems, video coding, 3DTV, and digital data transmissions.
Dr. Civanlar is a recipient of the 1985 Senior Award of the IEEE ASSP Society. He served as an Editor for the IEEE TRANSACTIONS ON COMMUNICATIONS, the IEEE TRANSACTIONS ON MULTIMEDIA, and the Journal of Applied Signal Processing. He is currently an Editor for Image Communications. He served as a member of the MMSP and MDSP Technical Committees of the IEEE Signal Processing Society. He is a Fulbright Scholar and a member of Sigma Xi.