How to Build a Scalable Low-Latency Streaming Solution

In this article I’ll look at low-latency live streaming at scale, and the standards and tools that make it possible. As a video player developer, I’m particularly interested in the concepts of low-latency live streaming and interactivity. For too long, we’ve limited streaming players to the same controls as a VCR: play, pause, rewind, fast forward, and so on. But low latency and interactivity expand what a video player can mean.

Figure 1 (below) shows some examples of players that incorporate interactive elements. With Periscope, you have an individual broadcasting live from their phone while their fans watch and tap the little heart button. The streamer sees that and can react to that feedback. With Twitch, you have a video game streamer. Their fans are chatting along with the stream, the gamer can respond to what’s happening in the chat, and the people watching get that feedback. HQ is a popular trivia app where a host asks questions, people in the audience respond by answering those questions, and the host reads off the results. You need that kind of quick feedback with Q&A.

Figure 1. Interactive streaming applications

When Low Latency Matters (and Why)

Traditional broadcast media doesn’t require this kind of interactivity. For events like the Super Bowl or the Olympics, it’s more of a lean-back experience. You might have Twitter open alongside, but it’s meant for sitting back. You’re not interacting with the people on the field or in a newscast. You’re just consuming the stream.

Of course, latency can detract from these experiences as well, like when you’re watching a match at home and hear your neighbors cheering a goal 10 seconds before you see it yourself. That spoils the moment. That’s a real issue with broadcast media, but I’d argue latency isn’t as big a problem there as it is with interactive video.

With an event like the Super Bowl or the World Cup, the producers intentionally introduce as much as 30 seconds of latency before the stream even leaves the venue. They keep that delay so that if a commentator goes off the rails and starts ranting about something crazy, the director can cut to a commercial.

In some scenarios, low latency isn’t worth the trouble. To get lower latency, whatever approach you take, you’re introducing some level of cost and instability into the network path from the venue to the player. For a big event, producers don’t want to take on that risk or the extra cost. You choose something you know is stable, consistent, and reliable.

Ultimately, the one thing streaming viewers hate more than latency is rebuffering. We’ve all had the experience of a stream rebuffering just as a player lines up a shot on goal.

Most of the time, chasing lower latency is what introduces more rebuffering into the chain. Reducing latency shrinks the buffer the player keeps to protect itself against rebuffering, and when you do that, you’re very likely to introduce more rebuffering across a wider portion of your audience.

No one should attempt low-latency streaming like this unless they’re confident that lowering the latency won’t introduce more rebuffering. Avoiding rebuffering is the higher priority.

Also, when I say “interactive live streaming,” I’m not talking about real-time, two-way audio communication apps like Google Hangouts, Skype, or Zoom. With those, you’re aiming for latency of 0.3 seconds or less. Go much beyond 0.3 seconds in that scenario and the speakers start talking over each other.

With interactive live streaming, we’re not trying to scale a Google Hangouts room up to 1,000 people; it isn’t built for that. What we’re talking about, low-latency live streaming at scale, is interesting because it falls right between these two things and picks up the challenges of both sides.

Building an Interactive Streaming App

When building an application that involves viewer interaction and low-latency streaming, you need a few things. First, you need a real-time data framework built on proven technologies such as WebSockets and WebRTC data channels. Plenty of readily available proprietary services such as Firebase, Pusher, or PubNub can help you build an application that’s fast and reliable enough that when a participant sends a chat message, other viewers can see it almost right away. Today, real-time data is a relatively solved problem.
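
To make the real-time data side concrete, here’s a minimal sketch of a chat channel over a plain WebSocket connection. The endpoint URL and message shape are assumptions of mine, and hosted services like Firebase, Pusher, or PubNub wrap this kind of plumbing in their own SDKs.

// Minimal chat-channel sketch (TypeScript, browser WebSocket API).
// The endpoint and message format below are illustrative assumptions.
type ChatMessage = {
  user: string;
  text: string;
  sentAt: number; // Unix epoch milliseconds
};

const socket = new WebSocket("wss://example.com/live/chat"); // hypothetical endpoint

socket.addEventListener("open", () => {
  const outgoing: ChatMessage = {
    user: "viewer123",
    text: "Nice shot!",
    sentAt: Date.now(),
  };
  // Publish the chat message; other viewers should see it almost immediately.
  socket.send(JSON.stringify(outgoing));
});

socket.addEventListener("message", (event) => {
  // Render messages from other viewers as they arrive.
  const incoming: ChatMessage = JSON.parse(event.data);
  console.log(`${incoming.user}: ${incoming.text}`);
});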

On the other side, we have low-latency video at scale, for which we don’t really have a clear-cut solution yet, although viable standards and solutions are starting to emerge.

How Low Is Low Latency for Streaming Video?

The chart in Figure 2 (below) illustrates what constitutes low latency for video. The left-hand side shows high latency, 30-60 seconds. It’s relatively common, especially on the web or on iOS devices. When Apple first introduced HLS, they recommended cutting the video into 10-second segments and having the player buffer three segments before starting playback. So, three segments times 10 seconds equals 30 seconds. That’s just in the player itself, and it doesn’t include the latency from the camera to the player.

Figure 2. Live streaming latency spectrum
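
As a rough sketch of that arithmetic (the function name and numbers are illustrative, not taken from the HLS spec), the player’s contribution to latency is simply the segment duration multiplied by the number of segments it buffers before starting playback:

// Player-side latency floor from the startup buffer (TypeScript).
function playerBufferLatencySec(
  segmentDurationSec: number,
  bufferedSegments: number
): number {
  return segmentDurationSec * bufferedSegments;
}

// Apple's original HLS guidance: 10-second segments, three buffered segments.
console.log(playerBufferLatencySec(10, 3)); // 30 seconds, before any camera-to-player delay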

Over the past few years, people have started shortening segments to 2-6 seconds so that players can start playback sooner, bringing latency down to 6-18 seconds. As mentioned earlier, you introduce a little more rebuffering. But if you take this approach and your player and network are reliable, low latency becomes easier to achieve.

There are plenty of demos of low-latency live streams using one-second segments. The problem is that these demos don’t always work well outside the demo environment. Once you get to real-world networks, an issue arises when you’re making a request for data every second. The requests themselves can have as much as half a second of overhead; even just starting to receive the data for those segments can have overhead. So you’re effectively introducing dead air into those requests, which makes it harder to keep up with the stream and harder to avoid entering a rebuffering state.
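
A quick way to see why one-second segments struggle is to look at how much of each segment interval goes to per-request overhead; this sketch assumes the roughly half-second figure quoted above.

// Fraction of each segment interval eaten by per-request overhead (TypeScript).
// The 0.5-second default reflects the rough figure quoted above.
function overheadFraction(segmentDurationSec: number, overheadSec = 0.5): number {
  return overheadSec / segmentDurationSec;
}

console.log(overheadFraction(1)); // 0.5  -> half the interval is overhead, not media
console.log(overheadFraction(6)); // ~0.08 -> the same overhead is mostly amortized away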

When we go down to one-second segments at scale, streams start to fall over a lot more quickly, especially on mobile networks and other difficult networks. I don’t recommend trying to go that low. You’re certainly welcome to test it, but when you’re watching a demo that shows how to hit that level with one-second segments, just be wary that in production it may not work as well.

With two-second segments, we can start talking about real low latency. If you get the player’s buffer down to six seconds, and assuming your camera-to-player delay is under four seconds, you’re in the sub-10-second range. These numbers aren’t perfect, but they work in the real world. Recently, I spoke with a developer who is building a Twitch clone. They have chat running alongside the video window, and today they’re comfortable with 10 seconds of latency. Of course, no latency would be preferable, but for the kind of interaction involved and the responsiveness with chat, sub-10 seconds is what people, in my experience, are asking for.
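
Plugging those numbers into the same kind of back-of-the-envelope math (the four-second camera-to-player figure is the assumption quoted above, not a measured value):

// Two-second segments with a three-segment startup buffer, plus an assumed
// camera-to-player delay of up to four seconds.
const playerBufferSec = 2 * 3;                    // 6 seconds held in the player
const cameraToPlayerSec = 4;                      // assumed upper bound
console.log(playerBufferSec + cameraToPlayerSec); // roughly 10 seconds end to end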

When Twitch first got started, they had as much as 20-second segments, and users got used to a 20-second delay between sending a chat message and getting a response. Now Twitch is down to the sub-six-second range, and expectations might change as people get more used to that.

But today, when I talk to developers building these applications, knowing that the challenge is to get down to lower latency, I hear people asking for 10 seconds. When you get into something like an HQ trivia app, where a big part of the host’s job is responding to audience reactions, that’s when I hear requests for four seconds or less.

As for sub-second latency, that is really the realm of WebRTC. RTMP also fits into this area. These protocols are great for enabling real-time audio communication, but they’re relatively expensive to scale. By contrast, HTTP delivery, which covers everything in Figure 2 outside the sub-second range, is cheap to scale but makes it difficult to get down to lower latency.

In this ultra-low-latency range, two different approaches are competing to become the low-latency protocol everyone uses: scaling WebRTC out with a bunch of media servers, or figuring out how to hack manifests in some way so those chunks reach the player as quickly as possible.
