Video, audio, and still-image media interchange presents a bewildering assortment of combinations and permutations to design engineers. Dozens of media formats, codecs, transmission protocols, and display technologies must somehow be woven into what seems to the consumer to be a seamless, simple system.
How to get all these disparate technologies to work together is a big challenge. It is an ambitious goal – but one that is well on its way to being achieved.
In today’s digital world, setting common interchange content formats and common network protocols is not sufficient. Digital Rights Management (DRM) must also be included because embedded DRM systems can prevent commercial premium content from being illegally copied, listened to or viewed, as required by the commercial content owners.
Given the level of complexity and widely available standards, getting devices to talk together is not so much a matter of creating new standards as it is of cooperation between leading companies in the PC, CE, and mobile markets.
Communication – not convergence
In the past, the electronics industry has used convergence to describe a digital home in which content from numerous sources is available to the consumer. Over time, however, convergence has also become associated with literally merging electronics equipment into a single, all-powerful device. In some scenarios it was a PC; in others, a media player; in still others, a set-top box. For a number of reasons, this vision of the digital home has not come to pass.
But that’s OK because all that consumers really want is for all of their electronics gear to work better together. The Digital Living Network Alliance (DLNA) was formed in 2003 to take that approach. Its first set of baseline design guidelines - version 1.0 - was introduced in June of 2004.
An organization with more than 200 members, including virtually all of the global brands in PC, CE and mobile electronics, the DLNA is pursuing a lowest-common-denominator strategy. Member companies commit themselves to implementing a selected number of already common and widely deployed formats, protocols and codecs in all new equipment.
DLNA’s initial 2004 v1.0 interoperability guidelines set a baseline for sharing digital content across a broad range of PC and consumer devices by agreeing to a set of core requirements and providing details on how they should be implemented. Drawn from PC and Internet standards, they include support for wired Ethernet and wireless LAN, IPv4, Universal Plug and Play (UPnP), and JPEG, LPCM and MPEG-2 as the baseline image, audio and video formats.
Optimizing performance
This brings up the matter of performance. When two devices that both support advanced compression algorithms link up, for example, defaulting to a baseline spec means a big performance hit.
Video is a good example. When broadcasting of MPEG-2 video began in 1993, most broadcast content required bandwidths in the range of 6 to 8 Mbits/s. High-motion content, such as basketball and football, which requires a lot of camera panning and scanning, needed almost the full maximum bit rate allowed for MPEG-2 main profile at main level (MP@ML), which is 15 Mbits/s. Compression algorithms improved over time until most broadcast-quality MPEG-2 content could be contained within the low 2 to high 5 Mbits/s range. Newer compression standards such as H.264 (MPEG-4 AVC or MPEG-4 Part 10) and Microsoft’s VC-1 provide a more sophisticated tool suite that reduces the bit rate by more than a factor of two compared with the older MPEG-2 offering. This allows broadcast-quality content to be distributed within the home at less than 1 Mbits/s, which is well within the available bandwidth of the devices used in the home for rendering the content.
From a hardware perspective, the early MPEG-2 broadcast encoders were implemented with 12 to 13 dedicated hardware ASICs. Today this is done with one or two devices, many of them programmable devices in lieu of the earlier hardware-only implementations. In addition, the processing power of programmable devices, such as TI DSPs, has increased as a result of faster clock speeds and more advanced parallel architectures, allowing them to implement the much more computationally intensive advanced compression algorithms such as H.264 and VC-1. Today, broadcast-quality standard-definition video can be compressed with one to four programmable devices, the actual number depending on the desired level of quality, the complexity of the algorithm, and the profile/level of the compression standard being implemented.
But lowering the bit rate to fall below the maximum channel capacity and bandwidth is not the only consideration at play here. In some instances, high bit rates could mean severely degraded quality, depending on the transport medium. Connections over an 802.11 WLAN, for example, are heavily dependent on distance: the sustainable bit rate drops off precipitously with the distance between sender and receiver. A 0.5 Mbits/s bandwidth requirement simply means that high-quality video service will be available throughout the home.
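The comparison between a stream's bit rate and what a wireless link can sustain can be sketched in a few lines of Python. The distance and throughput pairs below are assumed, illustrative values rather than measurements of any real 802.11 network, and the 25 percent headroom margin is likewise an assumption; only the 0.5 and 15 Mbits/s stream rates come from the discussion above.

```python
# Illustrative only: these (max distance in metres, sustainable Mbits/s)
# pairs are assumed values, not measurements of a real 802.11 network.
ASSUMED_LINK_MBITS = [(5, 20.0), (15, 8.0), (30, 2.0)]


def sustainable_rate(distance_m):
    """Return the assumed sustainable rate at a distance, or 0.0 beyond the table."""
    for max_distance, rate in ASSUMED_LINK_MBITS:
        if distance_m <= max_distance:
            return rate
    return 0.0


def stream_fits(stream_mbits, distance_m, margin=0.25):
    """True if the stream fits while `margin` of the link rate is kept as headroom."""
    usable = sustainable_rate(distance_m) * (1.0 - margin)
    return stream_mbits <= usable


# A 0.5 Mbits/s stream still fits at 25 m under these assumptions,
# while a 15 Mbits/s MP@ML stream does not.
far_low_rate_ok = stream_fits(0.5, 25)
far_high_rate_ok = stream_fits(15, 25)
```

Under these assumed figures the low-rate stream remains usable at the edge of coverage, which is the point the paragraph above makes about reach within the home.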
Subsequent versions of the DLNA interoperability specification address the performance issue by offering a number of optional standards. If two devices discover that each is MPEG-4 capable, for example, no transcoding to MPEG-2 will occur. Optional standards include GIF, PNG, and TIFF images; MP3, Windows Media Audio, AC-3, AAC, and ATRAC3 audio; and MPEG-4 Part 2, H.264 (MPEG-4 AVC or MPEG-4 Part 10), and Microsoft’s VC-1 video.
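The fallback logic can be illustrated with a short Python sketch. It is not taken from the DLNA guidelines: the function name, the capability lists, and the preference order are all hypothetical, and the only facts assumed from the text are that MPEG-2 is the baseline video format and that no transcoding is needed when both devices support the content's format.

```python
BASELINE_VIDEO = "MPEG-2"  # baseline video format in the v1.0 guidelines

# Assumed preference order, most efficient first; the guidelines do not
# prescribe this ranking.
PREFERRED_VIDEO = ["H.264", "VC-1", "MPEG-4 Part 2", "MPEG-2"]


def negotiate_video(source_format, server_formats, player_formats):
    """Return (format_to_send, needs_transcode) for one piece of content.

    server_formats: optional formats the server can transcode to (hypothetical).
    player_formats: optional formats the player can decode (hypothetical).
    """
    # Every device is assumed to support the baseline format.
    server = set(server_formats) | {BASELINE_VIDEO}
    player = set(player_formats) | {BASELINE_VIDEO}
    # If the player can decode the content as stored, send it untouched.
    if source_format in player:
        return source_format, False
    # Otherwise pick the most preferred format both sides support.
    for fmt in PREFERRED_VIDEO:
        if fmt in server and fmt in player:
            return fmt, True
    # Formats outside the preference list still fall back to the baseline.
    return BASELINE_VIDEO, True


# Both devices handle H.264, so the content is sent as stored.
same_format = negotiate_video("H.264", ["H.264"], ["H.264"])
# The player only handles the baseline, so the server transcodes to MPEG-2.
fallback = negotiate_video("H.264", ["H.264"], [])
```

Treating the baseline as always present on both sides is what guarantees that some common format exists, which is the purpose of a mandatory baseline.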
Universal Plug and Play
Earlier attempts at device interoperability have fallen short of the mark because they did not address a baseline set of requirements each device must support. However, one DLNA precursor – Universal Plug and Play (UPnP) – has broad support already and – along with DLNA – is a critical piece of the solution to the interoperability puzzle.
UPnP enables self-configuration and self-discovery between devices. Devices announce their capabilities and options without any user intervention. The specific mechanisms are: automatic address configuration, device discovery, command and control, event generation, and presentation for viewing device status and control.
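As an illustration of the discovery step, UPnP devices are located with SSDP, in which a control point multicasts an HTTP-formatted M-SEARCH request over UDP and devices answer with their description URL. The Python sketch below only builds the request and parses a reply, with no network traffic; the helper names are hypothetical, and the sample reply used in the usage lines is made up for the example.

```python
SSDP_ADDR = "239.255.255.250"  # SSDP multicast address used for UPnP discovery
SSDP_PORT = 1900


def build_msearch(search_target="ssdp:all", mx=2):
    """Build an SSDP M-SEARCH request as bytes, ready to send over UDP."""
    lines = [
        "M-SEARCH * HTTP/1.1",
        f"HOST: {SSDP_ADDR}:{SSDP_PORT}",
        'MAN: "ssdp:discover"',
        f"MX: {mx}",            # maximum seconds a device may wait before replying
        f"ST: {search_target}",  # search target: what kind of device is wanted
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(data):
    """Parse an SSDP reply into (status_line, headers) with upper-cased header names."""
    text = data.decode("utf-8", errors="replace")
    head = text.split("\r\n\r\n", 1)[0]
    status_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().upper()] = value.strip()
    return status_line, headers


# Search for media servers, then parse a made-up example reply.
request = build_msearch("urn:schemas-upnp-org:device:MediaServer:1")
sample_reply = (
    b"HTTP/1.1 200 OK\r\n"
    b"LOCATION: http://192.0.2.10:8080/description.xml\r\n"
    b"ST: urn:schemas-upnp-org:device:MediaServer:1\r\n\r\n"
)
status, headers = parse_response(sample_reply)
```

In a real control point the request would be sent with a UDP socket to the multicast address and port above, and the LOCATION header of each reply would be fetched over HTTP to obtain the XML device description.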
UPnP runs on top of the IP network layer and utilizes standards such as UDP, TCP, HTTP, XML, GENA, and SOAP. UPnP’s audio/video architecture consists of the following devices:
Figure 1 illustrates the basic UPnP architecture.
Figure 1. Basic UPnP architecture
DLNA baseline requirements
DLNA picks up where UPnP left off by defining baseline design guidelines. To keep its interoperability specification consumer focused, DLNA derives its design guidelines from carefully thought-out use cases and usage scenarios. After collecting a wide range of scenarios, DLNA sorts them into “immediate”, “next-version”, and “future” categories.
Use scenarios were analyzed for common elements and consistent features. The highest priority use cases were simplified by removing all non-essential details. The resulting guidelines deliver all the functionality needed with a relatively small set of device classes and function/capability categories.
In DLNA design guidelines V1.0, devices fall into two general groups, Digital Media Servers (DMS) and Digital Media Players (DMP).
DMS devices source, acquire, record and store media. They usually have rendering capability and they may have intelligence, such as device and user services management, rich user interfaces, media management, aggregation and distribution functions.
Some examples include:
DMP devices let users select and play the digital media stored in the home network. Examples include:
DLNA has also defined four primary use categories:
Version 1.0 of the guidelines covers only the first use category. The remaining three will be covered by the Version 1.5 guidelines, slated for release later this year.
An example of a possible combination of devices and use categories is shown in Figure 2.
Figure 2. An example of DLNA’s two-box pull use category
In this case, the four devices are a DMP (a digital TV) plus two PCs and a laptop that have pictures stored on them (the DMSs). The TV’s remote control mediates the interaction between the TV and the PCs. The use category is a two-box pull because the TV’s remote is used to find and select pictures and then pull them to the TV. Because the source of the pictures is accessed from the viewing device, the action is called a “pull”, and it is based on the HTTP GET method. If the user interface had been on the PC or laptop, the action of sending the pictures to the TV would be a “push”. The push model, based on RTP, will be an optional feature in the version 1.5 guidelines.
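The pull/push distinction comes down to which box hosts the user interface. The Python sketch below captures that rule and nothing more; the `Device` type and `transfer_mode` function are hypothetical names introduced for illustration and are not part of the DLNA guidelines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    device_class: str  # "DMS" (server) or "DMP" (player)


def transfer_mode(ui_device, source, renderer):
    """Classify a two-box transfer by where the user interface lives."""
    if ui_device == renderer:
        return "pull"  # the viewing device requests content from the source
    if ui_device == source:
        return "push"  # the source sends content to the viewing device
    raise ValueError("a two-box scenario places the UI on the source or the renderer")


tv = Device("Digital TV", "DMP")
pc = Device("PC", "DMS")

# The TV's remote selects the pictures, so the TV pulls them from the PC.
mode_from_tv = transfer_mode(ui_device=tv, source=pc, renderer=tv)
# If the selection were made on the PC instead, the PC would push them.
mode_from_pc = transfer_mode(ui_device=pc, source=pc, renderer=tv)
```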
Interoperability framework
Describing devices and use scenarios are important first steps but they are only half the interoperability battle. A framework and protocol are needed to actually send video or other media from one device to another.
DLNA’s initial (V1.0) guidelines organize interoperability requirements into five major categories:
Figure 3 shows not only the baselines for V1.0 but also projections for presently unsupported features such as QoS and digital rights management. Each guideline entry lists the device classes that it applies to, making it easy for device developers to identify mandatory and optional interoperability features.
The V1.5 guidelines will add guidelines for mobile handheld devices with Bluetooth connectivity, along with upload/download, RTP, WMM priority-based QoS, playlists, and new device classes such as networked media controllers (DMC), network-controllable media renderers (DMR), printers (DMPr), and mobile handheld device classes.
Figure 3. DLNA interoperability framework
Into the future
Although DLNA version 1.0 has taken a first step toward making the dream of a networked home with device interoperability a reality, there is still plenty of work to be done. Progress is required on four fronts: ease of setup and use; digital rights management; network security; and quality of service.
In a converged CE, PC and mobile handheld digital home, a programmable TI DSP is a perfect device for encoding, decoding, transcoding and DRM transcription to ensure format and DRM interoperability. Coupled with ease of setup, QoS and security, this allows the intended A/V content to be securely and seamlessly streamed over the home network to the intended device(s), achieving an unprecedented user experience.
Implementation considerations
The number of codecs, standards and transport methods that a system of near-universal connectivity must support raises the question: How is such a system best implemented?
The codec explosion began in the early 1990s with MPEG-2 and broadcast digital STBs. Computing power was all-important, and the most cost-effective implementations at the time were hardware-based, fixed-function ASICs with a very specific targeted level of functionality.
The widespread use of the Internet in the late 1990s brought with it many alternative codecs, protocols and standards. As a result, the fact that ASICs support only a subset of possible features became a major liability.
Moreover, as Internet applications continue to grow and diversify, systems will need even more versatility. Software-programmable solutions are the obvious alternative, and advances in process technology enable single- and multi-processor software solutions. Software-programmable solutions also make it easier to upgrade equipment as standards are modified.
With the emphasis in most codecs on signal-processing capability, DLNA and UPnP applications are ideally suited for DSP solutions. In addition to their processing capabilities, recent generations of DSPs have been optimized to meet low power requirements in a variety of applications, as shown in Figure 4.
Figure 4. DSPs address a wide spectrum of DLNA applications.
As the consumer, communications and computer industries enter a new era of connectivity and compatibility, DSP technology appears destined to play a pivotal role.
About the Authors:
Tim W. Simerly has been with TI for over three years and is the lead System Architect for the company’s streaming media solutions based on the DM642 digital media processor. Tim has spent most of his career as an R&D project manager with Rockwell International developing and building high-performance missile front-ends using visual CCD and IR FPA image sensors.
In 1995, he joined Scientific Atlanta and led the hardware/software development of a prototype video conferencing camera as an appliance for the set top box and was on the architecture/development team for Scientific Atlanta’s next generation set top boxes. In October of 1998, he joined IVEX where he served as architect, manager, director, and principal scientist of a startup developing smart network based camera products for the surveillance and security market. IVEX was acquired by Axcess Incorporated in September of 2001 where he served as Vice President of Video Engineering overseeing the development of their network based DSP products for streaming video and audio for the security industry.
He holds a BS (summa cum laude) and MS degree in electrical engineering from Georgia Tech, an MS in systems analysis from the University of West Florida and an MBA from Georgia State University.
Joseph Chou is the Director of Technical Marketing for TI’s DSP Streaming Media Group, where he is responsible for the strategic and technical direction of video applications for multimedia content in the IP-STB and Digital Media Adaptor (DMA) markets. Chou has spent the last 20 years working in the computer, communications and semiconductor industries, and currently holds several patents in computer and communication systems.
Chou received his bachelor’s degree in electronic engineering from Chao Tung University in Taiwan, and an MBA from the University of Dallas.