Why attend?
-
No marketing, ever.
Speakers are selected based on their submission, not how much money their company paid; we will never, ever sell a speaking slot. Attendee information isn't for sale either, not even to sponsors.
-
Affordable.
We want anyone in the industry to be able to come, which means keeping tickets reasonably priced (thanks largely to our generous sponsors). We also offer free and discounted tickets to students and open source contributors, so please reach out if you're interested.
-
For everyone in the community.
Our community is dedicated to providing an inclusive, enjoyable experience for everyone in the video industry. In this pursuit, and in keeping with our love for reasonable standards, we adopted the Ada Initiative's code of conduct.


Venue and Location
Bespoke
845 Market St, Suite 450, San Francisco, CA 94103
Bespoke is a large, configurable, high-tech conference space right in the heart of downtown San Francisco. It is easily accessible via BART and Muni. Bespoke is located on level 4 of the Westfield San Francisco Centre mall, next to Bloomingdale's.
Bespoke also played host to Demuxed in 2018, but we promise the chairs are much better this year!


The Talks
-
Nidhi Kulkarni
Mux
Just how live is your mobile live stream anyway?
How can you measure the latency of live video from the time it is ingested to when it plays back on a viewer’s device? In this talk, we describe how we calculate this metric using a methodology and heuristic that work on a variety of platforms, focusing on Android and iOS devices.
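For context, one generic way to approach such a measurement (a hedged sketch, not necessarily the speakers' heuristic): have the encoder stamp the stream with a wall-clock ingest time via timed metadata and compare it to the client clock at render time. `extractIngestTimestamp` below is a hypothetical helper for whatever metadata carriage is in use, and the clocks are assumed to be reasonably synchronized.

```typescript
// Sketch only: approximate ingest-to-display latency from an embedded wall-clock timestamp.
declare function extractIngestTimestamp(cue: VTTCue): number; // hypothetical: epoch ms stamped at ingest

function watchGlassToGlassLatency(metadataTrack: TextTrack): void {
  metadataTrack.addEventListener("cuechange", () => {
    const cue = metadataTrack.activeCues?.[0] as VTTCue | undefined;
    if (!cue) return;
    const ingestedAtMs = extractIngestTimestamp(cue);
    const latencyMs = Date.now() - ingestedAtMs; // assumes encoder and client clocks are in sync
    console.log(`approximate ingest-to-display latency: ${latencyMs} ms`);
  });
}
```
-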
Alex Converse
Twitch
RTMP done right: A roofshot
A great man once said we choose not to go to the roof because it looks good in your promo packet. We go to the roof because it's right there.
There's a ton of moonshot tech (from WebRTC to SRT and RIST to the new IETF media over QUIC) that seems well positioned to revolutionize live video ingestion in the coming years.
But RTMP is still deeply entrenched in many platforms and tools. What can we do with RTMP right now to make video ingestion work better?
This talk will walk through how to get better performance out of RTMP by managing the TCP connection appropriately for real-time data and by making better use of tools that have been in the RTMP spec this whole time.
As they say in the movie Hackers, "there's an Olympic-sized swimming pool on the roof." Bring your suit.
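As a rough illustration of the TCP-level care the talk alludes to (an assumption-laden sketch, not the speaker's code), a Node.js ingest client can disable Nagle's algorithm and watch its own send buffer; the host and threshold below are placeholders.

```typescript
// Sketch only: the kind of TCP-level housekeeping an RTMP ingest client can do in Node.js.
import * as net from "net";

const sock = net.connect({ host: "rtmp-ingest.example.com", port: 1935 }, () => {
  sock.setNoDelay(true); // disable Nagle so small real-time messages aren't held back
});

setInterval(() => {
  // Bytes queued in Node waiting to be flushed to the kernel. If this keeps growing,
  // the encoder is outrunning the network and should drop quality (or frames) at the source.
  if (sock.bufferSize > 512 * 1024) {
    console.warn(`send buffer backed up: ${sock.bufferSize} bytes`);
  }
}, 1000);
```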
-
Alex Zambelli
Warner Bros. Discovery
It's Time We Said Goodbye To Fractional Framerates
23.976, 29.97, 59.94. Nearly all of us working in video tech have encountered these oddly specific numbers more than once in our careers and can instantly recognize them as the digital video framerates most commonly used in North America (and a few other parts of the world). Millions of lines of code have been written by video engineers over the past decades to handle these framerates and the challenges that inevitably arise when trying to count whole frames as fractional values. So why do we go through all this trouble? Where did these strange framerates come from in the first place? Why did they prevail against saner framerates like 24, 30 and 60?
In my Demuxed talk I will dive into the history of video framerates, explain how technical design choices made by people in lab coats 70 years ago still haunt us even in the digital age - and then explain why we, the greatest streaming nerds who ever attended a Demuxed conference, have a golden opportunity to lead the entire video industry into the 21st century by embracing integer framerates and saying goodbye to unnecessarily dividing perfectly nice whole numbers by 1.001. There may even be a few FFmpeg demos to help jumpstart the revolution.
-
Ali C. Begen
Ozyegin University / Comcast
Catching the Moment in Low-Latency Live Streaming
Bandwidth prediction, which is already a difficult task, must be more accurate when lower latency is desired, because there is less time to react to bandwidth changes. Any inaccuracy in bandwidth prediction results in flawed rate-adaptation decisions, which will, in turn, translate into a diminished viewer experience. In this talk, we present several bandwidth prediction models (based on statistical and computational intelligence techniques) optimized for low latency, a rate-adaptation scheme (both heuristic and learning-based) and a playback speed control scheme to completely overhaul low-latency live streaming clients. The source code is publicly available, too.
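One of those three pieces, playback speed control, can be sketched in a few lines. This is a generic illustration under our own assumptions, not the presenters' open-source implementation: nudge `playbackRate` so the player holds a target live latency.

```typescript
// Sketch only: hold a target distance from the live edge by gently varying playbackRate.
function controlPlaybackRate(video: HTMLVideoElement, targetLatencySec = 3): void {
  setInterval(() => {
    const buffered = video.buffered;
    if (buffered.length === 0) return;
    // Distance from the playhead to the live edge of the buffer (a proxy for live latency).
    const latency = buffered.end(buffered.length - 1) - video.currentTime;
    if (latency > targetLatencySec + 0.5) {
      video.playbackRate = 1.05; // slightly fast: catch up towards the live edge
    } else if (latency < targetLatencySec - 0.5) {
      video.playbackRate = 0.95; // slightly slow: rebuild a small safety buffer
    } else {
      video.playbackRate = 1.0;
    }
  }, 500);
}
```
-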
Amy Rice
syd<video>
One Manifest to rule them all…
CMAF as a standard has been a game changer for Encoding/Origin/CDN workflows and, like many video teams working on major streaming platforms, we were excited about the promise of finally getting closer to having a single format… HOWEVER, this is not ‘just another CMAF as a single format’ talk…
Add SSAI to the mix and there are still the same challenges to solve, which got us wondering… Could CMAF be about more than sharing the same video segments? What if, with the three major DRMs aligning on CBCS, we were able to produce a single (simpler) HLS manifest for all platforms, removing the complications we have with DASH around multi-period player support and publishing multiple $number/$time outputs?
With the myriad of devices out there, can we really get to one manifest to rule them all?
-
Christian Feldmann
Bitmovin
The Art in the Video Codec
What is a video decoder? In essence it's a highly specialized painting engine that operates using special instructions which are packed into something we call a ‘bitstream’. These instructions include things like “move block from A to B” or “paint in this direction”. A normal encoder tries to use these instructions as efficiently as possible to “paint” a video that is as close to a reference video as possible. Everybody can do that. But what if we use the tools that the decoder offers to modify, morph or filter images, or even to paint directly with those tools? With a modified custom encoder we can write a standard-conformant bitstream that produces really fascinating output. In this talk I want to reveal the art in the video codec, but also teach a bit about the inner workings of the video decoders that produce this art.
-
Christoph Guttandin
Media Codings
Time for the Timing Object
Did you know that there is a standard to synchronize time-sensitive media across devices? It's a draft. It's not implemented in any browser. But luckily that's not a huge problem. You can still use it today.
The Timing Object specification is a W3C draft proposed by the Multi-Device Timing Community Group. The work on the spec started over 7 years ago. Sadly it went largely unnoticed. At that time there was not much interest in precisely synchronizing media in the same tab or across devices.
But today features like Apple's SharePlay or BBC's Together are well known and heavily used. The TimingObject (which is the core piece of the Timing Object specification) is what makes it easy to implement the same functionality on any website.
A TimingObject is very flexible and is not limited to synchronizing media. You can synchronize anything that fits on a timeline. That could be timed advertisements, synchronized stats for a sports broadcast, subtitles, or a second screen app that shows relevant information which matches exactly what is shown on the main screen.
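To make that concrete, here is a minimal sketch of keeping a video element in step with a shared timeline. The `TimingObjectLike` interface below is a simplified assumption loosely based on the draft; the actual specification and the timingsrc reference implementation differ in detail.

```typescript
// Sketch: keep an HTMLVideoElement in step with a shared timeline.
interface TimingVector { position: number; velocity: number; timestamp: number; }
interface TimingObjectLike {
  query(): TimingVector;                                    // current position/velocity on the shared timeline
  addEventListener(type: "change", listener: () => void): void;
}

function syncVideo(video: HTMLVideoElement, timing: TimingObjectLike): void {
  const THRESHOLD = 0.15; // seconds of drift tolerated before correcting

  const correct = () => {
    const v = timing.query();
    const drift = video.currentTime - v.position;
    if (Math.abs(drift) > THRESHOLD) {
      video.currentTime = v.position;          // hard seek for large drift
    } else {
      video.playbackRate = v.velocity - drift; // small rate nudge for small drift
    }
    if (v.velocity === 0 && !video.paused) video.pause();
    if (v.velocity !== 0 && video.paused) void video.play();
  };

  timing.addEventListener("change", correct);  // the shared timeline jumped, paused or resumed
  setInterval(correct, 500);                   // periodic drift correction
}
```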
-
Cise Midoglu
Simula Research Laboratory
7 Things They Don’t Tell You About Streaming Analytics
The purpose of this talk is to debunk a number of streaming analytics myths, Ha-Joon Chang style.
We discuss 7 "things" which address common misconceptions ranging from marketing ploys (*ahem*), such as how we are all empowering our video analytics pipelines with AI, to legitimate confusions, such as what makes an appropriate QoE representation for a given stakeholder and/or use case.
Thing 1: Streaming analytics is not only for debugging errors.
Thing 2: The video community is not actually using AI for streaming analytics.
Thing 3: What we talk about when we talk about QoE might be wildly different.
Thing 4: We do not have enough standards for streaming analytics.
Thing 5: Energy consumption should be a bigger concern for streaming analytics.
Thing 6: Open source software and open datasets can also benefit commercial streaming analytics products.
Thing 7: Streaming analytics is not ready for the metaverse.
We provide use cases, examples and lessons learned from both research and industry.
Bite-sized takeaways guaranteed.
-
Dan Jenkins
Broadcaster VC
Yup... WebRTC still sucks
Indiana Jones is a hero, right? Well, WebRTC is a hero too, right?
Time after time, Indiana defeats evil and saves the day. Time after time, WebRTC saves the day as well, connecting people during the pandemic and bringing together technologies for new purposes. I'm here to tell you all is not what it seems, for WebRTC or Indiana.
-
Dan Sparacio
Paramount
Caption Me If You Can
Captions, subtitles, and forced narratives increase the reach of viewership, allowing more people access to more content. In this talk I will discuss the challenges and solutions our video team uncovered while implementing timed text features for a global audience at Paramount. I will focus on the team’s experiences with connected TVs, set-top boxes and game consoles that support Media Source APIs. This talk will take a JavaScript player perspective on multilingual text, forced narratives, and navigating format and content preparation, all while meeting FCC regulations in order to supply the best viewing experience for all users.
-
Derek Buitenhuis
Vimeo
Are Video Codecs... Done?
Remember all those non-FAANG entities who deployed HEVC and VP9? Yeah, me neither. AV1? Eh. VVC? lol.
Are newer, more complex codecs even useful for anyone but megacorps? Have we reached The End, like we have for audio codecs, where only telcos are interested in lowering bandwidth further to save money? Have we reached the point where the compute/bandwidth costs will never again make sense for small/medium players, who are too big for hardware encoding, too small to reap the benefits of software encoding, and too small to invest in bespoke hardware?
Similarly, does anyone under the age of 50 work on codecs anymore? Have we made the barrier to entry so high that you need to spend 10 years banging your head against esoteric papers to understand everything in VVC? Are we all doomed to glue things together?
Are video codecs dead? This is an open question. Let's discuss.
-
Hojatollah Yeganeh
SSIMWave
What HDR format preserves creative intent better?
Tone mapping plays a pivotal role in rendering HDR content. No matter how expensive your HDR TV is or which format (HDR10, HLG, Dolby Vision or HDR10+) is used to deliver your favourite show to your living room, a tone-mapping operation is inevitable, as no consumer TV can reproduce 1,000 or 4,000 nits. Tone mapping almost always introduces structural detail loss to the creative intent unless the dynamic range of the content is fairly low.
In this talk Dr. Hojat Yeganeh will show a comparison between top HDR formats using structural fidelity maps and scores that provide deep insights into how well each HDR format preserves creative intent.
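For readers who have never looked at one, a deliberately naive tone-mapping curve is sketched below. It is an illustration under assumed peak values, nothing like a production HDR10/HLG/Dolby Vision pipeline, but the highlight compression it produces is exactly the kind of structural detail loss the talk's fidelity maps quantify.

```typescript
// Sketch only: an extended-Reinhard curve squeezing content mastered at 1000 nits
// into a display that peaks at 500 nits.
function toneMapNits(sceneNits: number, contentPeakNits = 1000, displayPeakNits = 500): number {
  const l = sceneNits / displayPeakNits;              // scene luminance, display-relative
  const lWhite = contentPeakNits / displayPeakNits;   // the value that should map to peak white
  const mapped = (l * (1 + l / (lWhite * lWhite))) / (1 + l);
  return Math.min(mapped, 1) * displayPeakNits;       // back to nits, clamped to the display
}

// 800 nits of highlight detail ends up around 431 nits here: the top of the range is compressed.
console.log(toneMapNits(800));
```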
-
Dylan Armajani
Akamai
Start Asking The Right Capacity Questions
Capacity planning is critical for any application pushing a lot of bits…but not all capacity is the same. Capacity and QoE are intrinsically tied together. In this short presentation we’ll discuss the nuances of capacity planning that are often overlooked, the questions one should be asking their delivery partners, and why one should be skeptical of any 10,000-foot-view figures put out by vendors.
-
Emmanuel Papirakis
Amazon
When Whack-a-Mole Won't Work: Enhancing the Durability of Video Connections Over Anycast
In anycast, a collection of servers from around the world share the same IP address. Since packets are routed based on the destination IP, there are multiple valid routes that can get packets to where they need to go, potentially landing on different servers. Intuition tells us that the shortest route will be selected, and the outcome should be that every video connection will land on the server that is closest to the client. This is a good thing and should ensure the best possible latency.
This approach is ideal for simple protocols where a single packet is needed for both the request and the response. But what about long-lived video connections? What happens when the shortest and best path for the packets of a connection changes? The connection breaks, and the client needs to reconnect, reauthenticate and continue where they left off. In this session, we will discuss how HTTP/3 and QUIC, along with XDP and tunneling, can be used to overcome this situation at planetary network scale.
-
Guillaume Bichot
Broadpeak
Low latency server side adaptive bitrate streaming supersedes client side algorithms
In ABR (adaptive bitrate) streaming, the player selects the quality (bitrate) as a function of its estimate of the maximum available bandwidth and of the buffer level. The player has one objective: to maximize the quality of experience (QoE) perceived by the user while avoiding rebuffering. Bandwidth estimation performed by the player is usually based on HTTP (the application layer), which does not work properly in some situations, such as CMAF low latency. With the latter, a video segment is split into chunks that are coded and transmitted continuously in such a way that a segment is sent and received roughly at the video bitrate. Relying on the buffer level is not really feasible either, as with low latency the aim is to keep the buffer as small as possible. To mitigate this problem, several player-side algorithms have been proposed and tested through a common test framework.
We have adapted that test environment to evaluate a server-side approach wherein the server decides the video segment’s bitrate/quality based on a bandwidth estimate taken from the underlying transport layer’s congestion control (e.g. BBR).
Several test campaigns have shown the merits of the server-side approach, which beats the best client-side algorithms.
In my presentation, I introduce the basics of the server-side ABR approach (in particular how it fits with standard client-side approaches such as MPEG-DASH), give details of the test bed, and present measurement results compared with the best-known client-side algorithms.
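A toy sketch of the server-side decision (our own simplification, not Broadpeak's implementation): take the bandwidth estimate exposed by the congestion controller and pick the highest rendition that fits under it with some headroom.

```typescript
// Sketch only: server-side rendition selection from a transport-layer bandwidth estimate.
interface Rendition { bitrateBps: number; path: string; }

function pickRendition(
  renditions: Rendition[],   // ladder, assumed sorted ascending by bitrate
  deliveryRateBps: number,   // estimate exposed by the congestion controller (e.g. BBR's delivery rate)
  safetyFactor = 0.8,        // leave headroom for bursts and chunked transfer overhead
): Rendition {
  const budget = deliveryRateBps * safetyFactor;
  let chosen = renditions[0];                 // never go below the lowest rendition
  for (const r of renditions) {
    if (r.bitrateBps <= budget) chosen = r;   // highest rendition that still fits the budget
  }
  return chosen;
}
```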
-
Hadi Amirpour
Universität Klagenfurt
Live is Life: Per-Title Encoding for Live Video Workflows
While live video streaming is expected to continue growing at an accelerated pace, one potential area for optimization that has remained relatively untapped is the use of content-aware encoding to improve the quality of live contribution streams. For those who have tried, it has been quite challenging due to the additional latency that complexity analysis adds to the streaming framework. However, it merits further investigation, as optimizing live video coding workflows may result in (i) decreased storage and/or delivery costs and (ii) increased QoE.
In this talk, we introduce a revolutionary video complexity metric that operates in real time and enables the adoption of per-title encoding (among other techniques) for live video workflows. We will explain and illustrate the key components of this metric and highlight the open source repository, including how video developers can utilize it in their video workflows. Finally, we will present selected applications (i.e., per-title encoding among others) and showcase their performance compared to state-of-the-art techniques for live video workflows.
Video developers will gain a deep understanding of video complexity, how to extract it in real-time, and how to use it in live video workflows.
-
Jean-Baptiste Kempf
Videolan
Updates from the Open Source Multimedia community
One year after the last Demuxed, we will share updates from the open source multimedia community.
We will notably speak about the FFmpeg 5.0/5.1 releases, the features and improvements they contain, and how they compare to what was scheduled, including the API additions of the 5.1 release.
We'll then speak about the new features being worked on for FFmpeg 6.0 and the release schedule.
Updates about dav1d, notably the 1.0.0 release, will be shared too, with an emphasis on performance testing and the potential improvements to AV1 decoding still to come.
Finally, we'll speak about other open source libraries, like x264 and placebo, and their updates for this year.
-
Joey Parrish
UpFish: Scripted Audio Manipulation for Streaming Media
In 2004, Brad Neeley released an alternate audio track to turn the first "Harry Potter" movie into "Wizard People, Dear Reader", a hilarious retelling in which Harry and friends drink and swear a lot. This off-color Riff-Tracks-esque release enjoyed many years of being passed around on CDs and DVDs, only to die out in the era of streaming as physical media became old-fashioned.
UpFish is a magical, open-source browser extension that uses the WebAudio API to bring "Wizard People" back to life on the streaming site of your choice. Just stream "Harry Potter" and enable the "Wizard People" script in the extension, et voila! You get the original audio, karaoke'd on-the-fly to remove the actors' voices, and Brad Neeley's dulcet tones laid over the original score. Watch as Harry, The Wretched Harmony, and Ronnie the F'ing Bear battle sick-ass draculas to rule the school, then finish the evening with a nightcap. A 2-minute chapter to give you a taste: https://www.youtube.com/watch?v=Bb9_UCNnBTc
But UpFish is more than that! With an open scripting format for audio manipulation, you can publish your own Riff-Tracks-style soundtracks to anything you want.
In addition to the custom scripting options, the extension comes with a built-in script for generic karaoke filtering on anything, a script to turn "Harry Potter" into "Wizard People", and one more to hack the classic Alfred Hitchcock film "Vertigo" into "Fartigo", in which Jimmy Stewart has gas every time he gets scared... or happy... or bored.
Available now! https://upfish.fans/
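For the curious, the on-the-fly karaoke trick boils down to a classic WebAudio pattern; the sketch below illustrates that pattern under our own assumptions and is not UpFish's actual code. Dialogue is usually mixed to the center, so summing the left channel with a phase-inverted right channel cancels most of it while leaving the score largely intact.

```typescript
// Sketch only: center-channel ("karaoke") cancellation with the WebAudio API.
function karaokeFilter(media: HTMLMediaElement): AudioContext {
  const ctx = new AudioContext();
  const source = ctx.createMediaElementSource(media); // the element's audio now routes through this graph

  const splitter = ctx.createChannelSplitter(2);
  const invert = ctx.createGain();
  invert.gain.value = -1;            // phase-invert the right channel
  const sum = ctx.createGain();      // summing point: left + (-right)

  source.connect(splitter);
  splitter.connect(sum, 0);          // left channel straight in
  splitter.connect(invert, 1);       // right channel through the inverter
  invert.connect(sum);

  sum.connect(ctx.destination);      // center-mixed dialogue largely cancels out
  return ctx;
}
```

A replacement narration track could then be mixed in through another gain node connected to the same destination.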
-
Kieran Kunhya
Open Broadcast Systems
Live video at 150 miles per hour!
Motor racing is a technically challenging sport, and getting onboard video from racing cars at 150 miles per hour is a massively complicated problem - heat, vibration, power and speed, to say the least. This presentation will explain these challenges, how we solved them in order to obtain some of the most exciting and dramatic footage possible in sport, and how 5G will significantly change the way onboard racing video can be produced. It will show how we have delivered reliable onboard video at several world-famous racetracks.
-
Leon Lyakovetsky
Podium
Why Video Captioning Needs Built-In Viewer Feedback in 2022 (and How We Do It)
We've all heard the hype, excitement, and fear of how AI systems are getting smarter & smarter, developing sentience, and generally taking over the world, creating a future of subjugation and despair for the human race. However, I am fairly confident that this bleak picture is not in our near future because there is one major problem: AI doesn’t even understand us that well.
Anyone who has used voice recognition on their phone or in their car will recognize that speech-to-text technology still has a long way to go. In the video world, this is nowhere more obvious than in auto-generated video captioning.
While auto-generated captions are better than no captions at all - incorrect spellings, wrong words, bad punctuation, and misplaced phrasing breaks among other discrepancies mean that human review and improvement is still needed for captions to accurately represent what is said and heard in videos. (If you’re watching a video for any long length of time and are not noticing any errors, that is thanks to human review!)
Accuracy in captioning is not a trivial matter, since captioning errors are not just a minor annoyance. ADA accessibility compliance demands 99% accurate captions, speaker labels, and phrase breaks, among other features that none of the auto-generated captioning services on the market today meet.
Yet most auto-generated caption errors can be fixed by far more people than just costly transcribers.
That’s why I propose that while the speech recognition wizards keep improving their methods and services, it is on us video engineers to allow interested viewers, those who are already watching and interested in fixing errors they see, the chance to easily give feedback to improve transcriptions of both recorded and live video. The goal of this is to increase accuracy and watchability for fellow viewers while also giving the machines better and better data to keep on improving.
In this talk, I will give a brief review of current speech-to-text technology, where it is limited, and why it will be limited until completely new techniques come along. Then I will outline both high-level ideas and actionable steps for video developers to add more feedback systems into their video players. This includes a demo that proposes updates to video player UIs for viewers to be able to easily give feedback, a backend that handles the inputs of an open-ended crowdsourced system in a productive manner, and updates to the caption file formats we use to capture this feedback effectively.
One day in the future, every video will be captioned 100% correctly by automation. But until that day, it's on us to incorporate simple feedback systems so that every video has the chance to be captioned correctly!
-
Marc Höppner
Akamai
A content owner, a CDN and a player walk into a bar.
Rebuffering. Since the 1950s, we've landed a man on the moon, sequenced the human genome, put a rover on Mars and developed self-driving cars. Why do we still have rebuffering in 2022? Content owners blame the CDN and the player, the CDN blames the content owner and the player, and the player blames the CDN and the content owner. Let’s stop the blame game – and have a mutual look at the data!
We have analyzed massive amounts of data over a period of more than 2 months. For each day, we have analyzed 15 billion client-to-edge requests for more than 10 million streaming sessions per day. We’ve focused on a VOD, multi-CDN setup with ad insertion. Based on a combination of client-side quality of experience data (via Common Media Client Data) and server-side quality of service data, we have classified buffer starvation events. We categorize them into buckets for where they could have been prevented: 1) at the player, 2) at the last mile, 3) at the CDN/origin or 4) at the workflow level. Leveraging the learnings from this unique data set enables us to significantly reduce end-users' rebuffering by taking advantage of the interplay of the player, the CDN and the workflow.
A content owner, a CDN and a player walk into a bar, and the user has a great streaming experience. Mission accomplished.
-
Mario Guggenberger
Bitmovin
A journey from manual to fully automatic video player testing
Manually testing the functionality of a video player takes time, and when a growing set of features and supported platforms leads to an exploding demand for test executions that causes a bottleneck in your development process, it is time to reconsider.
I will tell the story of how Bitmovin's test automation team incrementally transformed tedious manual testing into a fully automated system that executes 150,000 player tests and customer stream conformance tests every day on a worldwide distributed fleet of desktop computers, smartphones, smart TVs, gaming consoles and streaming sticks. More specifically, I will discuss writing a test framework, handling test results and flaky tests, unified cross-platform execution, device selection, software and hardware automation, platform quirks, SDLC integration, remote device usage, getting through lockdowns without interruption, and how we expose the system even to our customers.
-
Mattias Buelens
THEO
Baby’s first HTML5 <video> element
An HTML5 player for HLS or MPEG-DASH uses the HTML5 <video> element and the Media Source Extensions API to buffer fragmented MP4 segments and play them out as a smooth video stream. But what actually goes on behind the scenes of these web APIs? What happens when you play or seek the video, and how does the video element maintain smooth playback without dropping frames?
In this talk, I’ll use the WebCodecs and Canvas APIs as low-level building blocks to rebuild parts of the <video> and MSE APIs. Using that implementation, I’ll walk through some interesting bits of the playback and buffering logic, and explain what a streaming video player should look out for when using these APIs. Some of the topics that I’ll cover include:
* How seeking works, how the GOP size affects how fast you can seek, and why you should always put a keyframe at the start of each fMP4 segment.
* How quality switching works, and why you should try to align your segment boundaries across all qualities.
* How MSE manages the size of its buffer, and what a player can do to keep this size under control.
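As a flavor of the low-level building blocks the talk refers to, here is a minimal, assumption-laden sketch (not the speaker's code) that decodes video with WebCodecs and paints frames onto a canvas; demuxing fMP4 segments into EncodedVideoChunks is assumed to happen elsewhere, and the codec string is just an example.

```typescript
// Sketch only: a bare-bones "paint engine" built from WebCodecs + Canvas.
function createRenderer(canvas: HTMLCanvasElement, codec = "avc1.42E01E"): VideoDecoder {
  const ctx = canvas.getContext("2d")!;
  const decoder = new VideoDecoder({
    output: (frame: VideoFrame) => {
      ctx.drawImage(frame, 0, 0, canvas.width, canvas.height); // "paint" the decoded frame
      frame.close();                                           // release frame memory promptly
    },
    error: (e) => console.error("decode error", e),
  });
  decoder.configure({ codec });
  return decoder;
}

// Feeding it: each segment should start with a keyframe so decoding (and seeking) has somewhere to begin.
// decoder.decode(new EncodedVideoChunk({ type: "key", timestamp: 0, data: keyframeBytes }));
```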
-
Nicolas Levy
Qualabs
-
Emil Santurio
Qualabs
How we "ScaleUP" the next generation of us: Learning video and a de-LIGHT-ful Player
In the last 12 months at Qualabs, we had to train 50 new video engineers, challenging us to systematize the way we train newcomers. To face this challenge we created "ScaleUp", a team designed to ramp up technical video knowledge and teamwork using the methodology of "learning by doing"!
Developing video engineers is a challenge that we all share in the industry! At Demuxed 2020, Alexandria Shealy and Kevin Fuhrman introduced this topic in the talk "Growing the Next Generation of Us". This time, we will talk about how we have improved the way we develop our next generation and the lessons we have learned by doing this.
We will review various iterations, from "hey!, read this tutorial and tell me what you learned" to today, where the real learning comes from real project stories, running custom-designed projects, and feedback, feedback and more feedback.
In addition, we will present the De-LIGHT-ful Player!, an open-source player created by a "ScaleUp" batch, inspired by Phil Cluff's Smell-O-Vision from Demuxed 2021. We will use it as an example of how to design the learning path, how to use the "aha" moments, the methodology we use during the execution, and of course, a de-light-ful demo.
-
Peter Howard
Practical Applied Strategy
Video Remixing at the Edge
WebAssembly enabling compute at the edge is changing the way we design and deploy digital content and products, and will up-end lots of video delivery use cases.
I argue that we should think of edge computing as a remix layer, enabling new levels of personalisation to deliver more engaging video experiences. In this session I'll share some examples of how this works, and talk to the ways we need to think about designing content — and more particularly, content fragments — to enable these new experiences.
-
Peter Tseng
Eluvio
-
Qingyuan Liu
Eluvio
FFMARK - An Open Framework for Forensic Watermarking
How hard can it be to hide some bits in a video? Long story short, we started a free and open-source project: The Open Framework for Forensic Watermarking.
In the beginning, we tested and compared various video watermarking techniques - including frequency domain algorithms and a machine learning model - based on metrics like performance, imperceptibility, and robustness against attack. We also looked into commercial solutions, but they were not flexible enough for some of our use cases, such as the just-in-time insertion of per-user watermarks for content distribution.
Given the dearth of existing open-source projects, we decided to start one ourselves. Initially, our main implementation is for the DTCWT (dual-tree complex wavelet transform) algorithm, but we hope to have implementations of many other methods that the user can choose from. We want to build a framework that is equally useful to researchers, developers, and content operators. Hopefully we're not oFFmark!
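As a flavor of the problem space (and emphatically not FFMARK's DTCWT method), here is a naive spatial-domain sketch of the core idea: each payload bit nudges the average luma of an 8x8 block onto an even or odd quantization step, a change small enough to be hard to see but recoverable later from decoded frames. All constants are assumptions.

```typescript
// Sketch only: hide one bit per 8x8 block by quantizing the block's average luma.
const STEP = 4; // quantization step in luma units; larger = more robust, more visible

function embedBit(luma: Uint8ClampedArray, width: number, blockX: number, blockY: number, bit: 0 | 1): void {
  // Compute the block's current average luma.
  let sum = 0;
  for (let y = 0; y < 8; y++)
    for (let x = 0; x < 8; x++)
      sum += luma[(blockY * 8 + y) * width + blockX * 8 + x];
  const avg = sum / 64;

  // Snap the average to the nearest quantization step whose parity encodes the bit.
  let target = Math.round(avg / STEP) * STEP;
  if ((target / STEP) % 2 !== bit) target += STEP;
  const delta = target - avg;

  // Apply the same small offset to every pixel in the block.
  for (let y = 0; y < 8; y++)
    for (let x = 0; x < 8; x++) {
      const i = (blockY * 8 + y) * width + blockX * 8 + x;
      luma[i] = Math.max(0, Math.min(255, luma[i] + delta));
    }
}
```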
-
Steve Robertson
Google
Color grading HDR
Last time, we talked at a high level about the challenges facing the consumer HDR ecosystem. Now let's get specific.
Here are our recommendations for exactly how to produce HDR content - workflow, lighting conditions, how to grade on a MacBook, rules of thumb, EOTFs, what looks good and bad, deliverable file formats - all backed up by testing and years of experience.
-
Lionel Bringuier
Videon
Taking the headache out of timed metadata for live video
The trend to merge video and timed metadata is now mainstream, but inherent challenges still exist when it comes to merging multiple live video feeds with multiple sources of timed metadata in the media and entertainment (M&E) space - for example, captioning, digital rights management, and syncing multiple live streams from multiple cameras. This creates barriers for live betting, sports, and events to create better viewing experiences for their end-users.
Why is it a challenge?
Today, many live video operators use HTTP-based OTT workflows, sending video feeds from the camera to the Content Delivery Network (CDN). However, these workflows are subject to latency of up to seven seconds, if not more. They also do not allow live video operators to process the live streams and leverage data without encoding and transcoding them, raising the cost and overall complexity of the workflow.
In addition, workflows generally use SDI VITC timestamps rather than UTC for each frame, creating synchronization discrepancies across metadata sources and camera feeds in different locations, which degrades the overall viewing experience.
How did we solve this?
KLV, a SMPTE data encoding standard also used by the military to embed data in live video feeds, combines metadata with geospatial visualization, offering a new way to enhance the user experience and enabling new use cases such as precise synchronization and timestamping of event highlights across multiple live video streams.
As a practical use case, a precision-timestamped wall clock embedded in live video streams can enable effective sports adjudication, betting, gamification, and more.
Why choose this topic?
Timed metadata has always been a pain in the ass. Right?
Well, we solved that using a military standard, for good. Our mission is to positively impact society by simply moving media.
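For readers who haven't met KLV, it is just key-length-value triplets: a 16-byte SMPTE universal key, a BER-encoded length, then the payload. The sketch below encodes a UTC timestamp that way; the key is a placeholder rather than a registered SMPTE label, and this is an illustration, not anything from the talk.

```typescript
// Sketch only: build one KLV triplet carrying a 64-bit UTC timestamp in microseconds.
const PLACEHOLDER_KEY = new Uint8Array(16).fill(0x06); // real keys come from the SMPTE registry

function berLength(len: number): Uint8Array {
  if (len < 128) return Uint8Array.of(len);            // short form
  const bytes: number[] = [];
  while (len > 0) { bytes.unshift(len & 0xff); len >>>= 8; }
  return Uint8Array.of(0x80 | bytes.length, ...bytes); // long form: count byte, then length bytes
}

function encodeTimestampKlv(utcMicros: bigint): Uint8Array {
  const value = new Uint8Array(8);
  new DataView(value.buffer).setBigUint64(0, utcMicros); // big-endian 64-bit timestamp
  const length = berLength(value.length);
  const out = new Uint8Array(PLACEHOLDER_KEY.length + length.length + value.length);
  out.set(PLACEHOLDER_KEY, 0);
  out.set(length, PLACEHOLDER_KEY.length);
  out.set(value, PLACEHOLDER_KEY.length + length.length);
  return out;
}

// e.g. encodeTimestampKlv(BigInt(Date.now()) * 1000n)
```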
-
Tom Howe
Disney
Panama: Controlling the Video Traffic Flows
As our video catalog grows, so do our delivery costs, distribution complexity, incident mitigation challenges, and more. We developed a system, Adaptive Traffic Routing (ATR), to address these traffic routing needs. Codenamed Panama, the system is designed to balance the following competing goals:
1. High Quality of Service for end users (always the top priority)
2. Cost of CDN usage, particularly when accounting for award-tier pricing
3. Stabilize traffic to all CDNs, especially those running internally or which are particularly sensitive to traffic spikes
4. Minimize the cognitive load on our operators who are trying to accomplish all of the above.
Panama works by ingesting data from multiple sources, analyzing them, and creating a new set of rules for how to route users to CDN endpoints. Panama generates these rules every minute and feeds them to a service at the edge which assigns user requests to specific CDN endpoints. Since these rules are static and precompiled, they provide extremely high performance and scalability.
We strive to address all of the above concerns through our ingest data and our compilation algorithms.
This project has been highly focused on utilizing the data feedback to make the streaming experience more positive for everyone involved.
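As a rough illustration of what static, precompiled routing rules consumed at the edge could look like (our own sketch, not Panama's actual rule format or selection logic):

```typescript
// Sketch only: a per-minute routing table and a weighted CDN assignment at the edge.
interface CdnRule { cdn: string; weight: number; }  // weights within a region sum to ~1
type RoutingTable = Record<string, CdnRule[]>;      // keyed by viewer region

function assignCdn(table: RoutingTable, region: string, sessionHash: number): string {
  const rules = table[region] ?? table["default"];
  // Map a stable per-session hash in [0, 1) onto the cumulative weight distribution,
  // so a session keeps hitting the same CDN while the weights stay fixed.
  let cumulative = 0;
  for (const rule of rules) {
    cumulative += rule.weight;
    if (sessionHash < cumulative) return rule.cdn;
  }
  return rules[rules.length - 1].cdn;               // guard against rounding
}

// Example table, regenerated every minute by the control plane:
const table: RoutingTable = {
  "us-west": [{ cdn: "cdn-a", weight: 0.7 }, { cdn: "cdn-b", weight: 0.3 }],
  "default": [{ cdn: "cdn-a", weight: 1.0 }],
};
```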
-
Vanessa Pyne
Daily
Fake it 'til you make it: Test patterns for fun and profit
Engineers tend to dread testing their software, but I don't understand why, because testing things, especially video on the internet, is super fun! I'm fascinated by the fake media and test patterns we use, and aim to answer these questions in this talk:
Who are the unsung heroes behind bars and tone? Assumed knowledge says Big Buck Bunny is used because it is open source, but is that true? What are the bizarre backstories of test media we rarely give a second thought, like, have you ever visited the website listed in the top left of the Netflix test pattern or seen the WebDriver Torso Youtube account? What generates the fake camera streams in WebDriver flavors of chrome, gecko, and safari? This question in particular led me to dig through "found footage" in Geckodriver source code in a quest to try to find its slow-rainbow-fill-fade fake media stream that always makes me nostalgic for the FBI warning at the beginning of VHS tapes. Don't you want to know what other test media I found??
Inspired by these test pattern origin stories, we'll also cover tips and tricks to automate tests for your WebRTC/live stream app such as how to generate fake media streams via gstreamer, ffmpeg, v4l2loopback, and likely some unholy amalgamation thereof (because it's fun!). We'll talk about virtual capture devices at the browser (think `--use-fake-device-for-media-stream` & `--use-file-for-fake-video-capture` in Chromedriver) and (mac)OS level. Finally, we'll fake out the Web APIs `enumerateDevices` and `getUserMedia` to boot. After this talk, you'll never look at a test pattern the same way.
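In the spirit of that getUserMedia fake-out, here is a rough sketch under our own assumptions (not the speaker's code): replace `navigator.mediaDevices.getUserMedia` with a canvas-generated test pattern so WebRTC code under test gets a predictable stream without a real camera.

```typescript
// Sketch only: a fake camera backed by a canvas test pattern.
function installFakeCamera(width = 640, height = 480, fps = 30): void {
  const canvas = document.createElement("canvas");
  canvas.width = width;
  canvas.height = height;
  const ctx = canvas.getContext("2d")!;

  let hue = 0;
  setInterval(() => {                      // slow rainbow fill, VHS-warning nostalgia optional
    ctx.fillStyle = `hsl(${hue}, 100%, 50%)`;
    ctx.fillRect(0, 0, width, height);
    hue = (hue + 1) % 360;
  }, 1000 / fps);

  // Anything calling getUserMedia from now on receives the canvas stream instead of a camera.
  navigator.mediaDevices.getUserMedia = async () => canvas.captureStream(fps);
}
```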
-
Vittorio Giovara
Vimeo
Video Encoding in Virtual Reality: Spatial, 360, and Ambisonics
If you spend too much time in VR and often wonder how you ended up turning off your headset at 3am *again*, this talk is for you!
This presentation will provide an overview of VR technologies, focusing on the video encoding standards that emerged over the last few years, and touching on the audio aspects with Ambisonics, of course.
In particular we'll review the Spatial Metadata standard, showcasing its creation and adoption, followed by the various spatial formats available with pros and cons. We'll also review the new signaling options for Ambisonics, and the new FFmpeg API that can be used for it.
In order to be fully meta, this talk will be recorded and streamed from a VR environment.
-
Walker Griggs
Mux
Timestamp Troubles: How Mux handles unreliable system clocks in virtual environments
Video is hard, and reliable timestamps in increasingly virtual environments are even harder.
We at Mux recently broke ground on a new live video experience, one that takes a website URL as input and outputs a livestream. We call it Web Inputs. As with any abstraction, Web Inputs hides quite a bit of complexity, so it wasn’t long before we ran up against our first “unexpected behavior”: our audio and video streams were out of sync.
This talk walks you through our experience triaging our timestamp troubles. It’s a narrative account that puts equal weight on the debugging process and the final implementation, and aims to leave the audience with a new perspective on the triage process.
I hope you’ll learn from our mistakes, a bit about Libav audio device decoders, and hopefully a new pattern for web-to-video streaming.
-
Will Law
Akamai
Clever Monkeys Communicating Discreetly
This talk examines the simple elegance of the CMCD solution and its rapid growth and deployment over the past year. Did you know that 25% of all video played back in the USA already sends CMCD data? Under the dual lenses of performance improvement and real-time data visibility, we'll examine why and how content distributors are using it and why CDNs want to receive it. We'll look at its use in novel research projects, open-source deployments, and player implementations, take a peek into use cases coming with the next version, and examine how the upcoming CMSD spec completes the data flow.
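For context, CMCD data most commonly rides along with each media request as a query parameter the CDN can log. The sketch below (an illustration, not Akamai's or any particular player's implementation) builds that parameter from a few real CMCD keys: br (encoded bitrate, kbps), bl (buffer length, ms), mtp (measured throughput, kbps), sf (streaming format) and sid (session id).

```typescript
// Sketch only: attach a CMCD query parameter (CTA-5004) to a segment request URL.
function withCmcd(segmentUrl: string, data: { br: number; bl: number; mtp: number; sid: string }): string {
  const cmcd = [
    `br=${data.br}`,
    `bl=${data.bl}`,
    `mtp=${data.mtp}`,
    `sf=h`,                            // "h" = HLS
    `sid="${data.sid}"`,               // string values are quoted
  ].sort().join(",");                  // keys in alphabetical order, comma-separated
  const url = new URL(segmentUrl);
  url.searchParams.set("CMCD", cmcd);  // URL-encoding handled by searchParams
  return url.toString();
}

// withCmcd("https://cdn.example.com/seg_42.m4s", { br: 3000, bl: 9500, mtp: 25000, sid: "abc-123" })
```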
-
Yuriy Reznik
Brightcove
Origins of DCT, Zigzag scan, I-, P-, B-frames, GOPs, and some other fun things in video
In this talk, I will focus on the history of image and video coding algorithms and trace the origins of many key design decisions that define the architectures of modern video codecs.
Among the things I will review are:
* Origins of DCT (Fourier 1822(!), Ahmed 1972, Ahmed, Natarajan, & Rao 1974)
* Origins of Zigzag scan (G. Cantor, 1873, W. Lukosz 1962, A. Tesher 1973)
* Early transform-based codecs (A. Tescher 1973, Cox & Tescher 1976, Chen & Smith 1976, Kamangar & Rao, 1981, CLI TI system 1982, JPEG - 1991)
* Origins of predictive coding and P frames:
- DPCM (C. Cutler, 1950, DPCM NTSC codec – R. Brainard & A. Netravali, 1982)
- “Conditional replenishment” (CCITT H.120, 1984)
- “Frame difference coding” (CCITT H.120 v2, 1988)
- first “motion compensation-based” codecs (CCIR 721, 723, CCITT H.261, 1989-1991)
* Origins of B frames (T. Micke 1986, A. Puri, B. Haskell, et al. 1990s, MPEG-1 1993)
* Origins of the GOP concept (Nagata et al., 1990, MPEG-1 1993)
And while this material is usually highly technical and mathematical, I’ll try to present it all in a fun and simple fashion, accessible to a broad audience.
-
Zoe Liu
Visionular
-
Thomas Davies
Visionular
Effective Per-Title Encoding For UGC Videos Using Machine Learning
Per-title encoding, first proposed by Netflix [1], aims to achieve the best visual quality subject to a predefined maximum bitrate constraint for any arbitrary video content. Ideally, the quality-bitrate convex hull of a given video should be obtained by encoding the video with typical (bitrate, resolution) ladders and drawing the resulting quality-bitrate curves.
For UGC (User Generated Content) videos, it is not practical to obtain the convex hull of every single video, as the volume of UGC to process is usually enormous. Meanwhile, individual UGC videos have to be processed sufficiently fast. Hence, how to derive a per-title-like approach for UGC has become a challenging but fairly attractive research topic. Quite a few state-of-the-art approaches have been proposed, in particular featuring the use of machine learning.
In this talk, we first outline the typical UGC per-title problem as follows:
1. A set of (resolution, bitrate) ladders are predefined;
2. A maximum bitrate that indicates the real-time bandwidth constraint is specified;
3. We need to decide (a) which resolution to choose, and (b) which CRF value to configure for the encoder, so that the encoding bitrate satisfies the maximum bitrate constraint while achieving the best possible visual quality.
We have practiced the following approach for UGC per-title-like encoding:
Step 1: Extract spatial / temporal features for a given UGC video.
Step 2: Pre-train a machine-learning model to map the extracted spatial/temporal features from Step 1 to the triplet (bitrate, VMAF, CRF) for all predefined resolutions.
Step 3: For a given maximum bitrate constraint, based on the predefined (bitrate, resolution) ladders, exploit the machine learning model to predict the chosen resolution and the encoder CRF parameter.
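A sketch of how Step 3 could look in code, under stated assumptions (the prediction interface and the `model.predict` call are hypothetical, standing in for whatever the trained model exposes):

```typescript
// Sketch only: choose the (resolution, CRF) pair with the best predicted quality under the bitrate cap.
interface Prediction { resolution: string; crf: number; bitrateKbps: number; vmaf: number; }

function choosePerTitleParams(
  predictions: Prediction[],   // one entry per predefined ladder rung, from the ML model
  maxBitrateKbps: number,      // the real-time bandwidth constraint
): Prediction | undefined {
  return predictions
    .filter((p) => p.bitrateKbps <= maxBitrateKbps)
    .sort((a, b) => b.vmaf - a.vmaf)[0];   // best predicted VMAF within the budget
}

// e.g. choosePerTitleParams(model.predict(features), 4500) -> { resolution: "1080p", crf: 27, ... }
```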
Using the above approach, we may effectively resolve the following during multi-rendition adaptive encoding/transcoding:
(1) The VMAF-bitrate curve usually levels off once the bitrate increases beyond a certain level. Sometimes an unnecessarily large bitrate is spent chasing a VMAF score that is already very high; the target VMAF can be lowered slightly while producing a much lower bitrate.
(2) The predefined (bitrate, resolution) ladders may be over-defined for a certain video category, meaning too many ladders have been pre-defined. For certain UGC video categories, some (bitrate, resolution) ladders can be removed in advance, which helps significantly speed up the per-title processing.
Overall, we will demonstrate that machine-learning-based per-title encoding can be efficiently and effectively applied to UGC videos. It not only achieves a better visual experience at a lower bitrate, but also runs at fairly low computational complexity.
Reference:
[1] Anne Aaron, Zhi Li, Megha Manohara, Jan De Cock and David Ronca, "Per-Title Encode Optimization", Originally published at techblog.netflix.com on December 14, 2015.

Our Speakers
-
Alex Converse
Twitch
-
Alex Zambelli
Warner Bros. Discovery
-
Ali C. Begen
Ozyegin University / Comcast
-
Amy Rice
syd<video>
-
Christian Feldmann
Bitmovin
-
Christoph Guttandin
Media Codings
-
Cise Midoglu
Simula Research Laboratory
-
Dan Jenkins
Broadcaster VC
-
Dan Sparacio
Paramount
-
Derek Buitenhuis
Vimeo
-
Dylan Armajani
Akamai
-
Emil Santurio
Qualabs
-
Emmanuel Papirakis
Amazon
-
Guillaume Bichot
Broadpeak
-
Hadi Amirpour
Universität Klagenfurt
-
Hojatollah Yeganeh
SSIMWave
-
Jean-Baptiste Kempf
Videolan
-
Joey Parrish
-
Kieran Kunhya
Open Broadcast Systems
-
Leon Lyakovetsky
Podium
-
Lionel Bringuier
Videon
-
Marc Höppner
Akamai
-
Mario Guggenberger
Bitmovin
-
Mattias Buelens
THEO
-
Nicolas Levy
Qualabs
-
Nidhi Kulkarni
Mux
-
Peter Howard
Practical Applied Strategy
-
Peter Tseng
Eluvio
-
Qingyuan Liu
Eluvio
-
Steve Robertson
Google
-
Thomas Davies
Visionular
-
Tom Howe
Disney
-
Vanessa Pyne
Daily
-
Vittorio Giovara
Vimeo
-
Walker Griggs
Mux
-
Will Law
Akamai
-
Yuriy Reznik
Brightcove
-
Zoe Liu
Visionular

The Schedule
-
Breakfast
-
Opening Remarks
Matt McClure
-
Christoph Guttandin
Media Codings
-
Joey Parrish
-
Vanessa Pyne
Daily
-
Break
-
Dan Sparacio
Paramount
-
Zoe Liu
Visionular
Thomas Davies
Visionular
-
Tom Howe
Disney
-
Lunch, sponsored by:
-
Mattias Buelens
THEO
-
Guillaume Bichot
Broadpeak
-
Nicolas Levy
Qualabs
Emil Santurio
Qualabs
-
Hadi Amirpour
Universität Klagenfurt
-
Break
-
Marc Höppner
Akamai
-
Peter Howard
Practical Applied Strategy
-
Jean-Baptiste Kempf
Videolan
-
Mario Guggenberger
Bitmovin
-
Break
-
Ali C. Begen
Ozyegin University / Comcast
-
Peter Tseng
Eluvio
Qingyuan Liu
Eluvio
-
Hojatollah Yeganeh
SSIMWave
-
Will Law
Akamai
-
Breakfast
-
Alex Zambelli
Warner Bros. Discovery
-
Emmanuel Papirakis
Amazon
-
Vittorio Giovara
Vimeo
-
Leon Lyakovetsky
Podium
-
Break
-
Nidhi Kulkarni
Mux
-
Christian Feldmann
Bitmovin
-
Kieran Kunhya
Open Broadcast Systems
-
Cise Midoglu
Simula Research Laboratory
-
Lunch, sponsored by:
-
Lightning Talks
-
Amy Rice
syd<video>
-
Dylan Armajani
Akamai
-
Break
-
Yuriy Reznik
Brightcove
-
Dan Jenkins
Broadcaster VC
-
Alex Converse
Twitch
-
Break
-
Derek Buitenhuis
Vimeo
-
Lionel Bringuier
Videon
-
Walker Griggs
Mux
-
Steve Robertson
Google
-
Surprise
-
Closing Remarks
Matt McClure
-
Afterparty


About us
Demuxed is simply engineers talking about video technology. After years of chatting about video at the SF Video Technology Meetup, we decided it was time for an engineer-first event with quality technical talks about video. Our focus has traditionally been on content delivered over the web, but topics cover anything from encoding to playback and more!
Most of the organization and work behind the scenes is done by folks from Mux (Demuxed came first ☝️) but none of this would be possible without amazing people from the meetup.
Every year we get a group together that's kind enough to do things like schedule planning, help brainstorm cool swag, and, most importantly, argue heatedly over which talk submissions should make the final cut.


