Vizario H264RTSP for Unity3D

This package provides an H264 recorder for Unity3D that sends data over RTSP.

To restate the main purpose of this package:

Vizario H264RTSP is a lightweight library that enables RTSP streaming FROM iOS or Android devices (as well as from standalone apps or the Editor on Win64 and macOS) to a local PC or remote server. The library lets you record the content rendered by a Unity game camera and either push the encoded H264 bitstream to a server instance or serve it directly from the device within a (WiFi) environment.
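As a rough orientation, the capture side of such a setup typically looks like the sketch below: the game camera renders into a RenderTexture, and that texture is handed to the streaming component each frame. The `RTSPStreamer` component and its `PushFrame` method are hypothetical placeholders, not the package's actual API; consult the package documentation for the real entry points.

```csharp
using UnityEngine;

// Minimal sketch: render a game camera into a RenderTexture that a streaming
// component can consume. "RTSPStreamer" and "PushFrame" are hypothetical
// placeholders for the package's real components, not its actual API.
public class CameraCaptureSketch : MonoBehaviour
{
    public Camera captureCamera;          // the Unity game camera to record
    // public RTSPStreamer streamer;      // hypothetical streaming component

    private RenderTexture captureTexture;

    void Start()
    {
        // 720p30 is a realistic target for mobile/standalone VR (see FAQ below).
        captureTexture = new RenderTexture(1280, 720, 24, RenderTextureFormat.ARGB32);
        captureTexture.Create();
        captureCamera.targetTexture = captureTexture;
    }

    void LateUpdate()
    {
        // Hand the rendered frame to the encoder/streamer here, e.g.:
        // streamer.PushFrame(captureTexture);
    }

    void OnDestroy()
    {
        captureCamera.targetTexture = null;
        captureTexture.Release();
    }
}
```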

FAQ:

This is a list of findings from the last 4 years of using the package. The list will be extended over time as new information comes in.

  • Q: Does it work within VR setups?
  • A: Yes, it has been tested to work on Oculus Quest 2/3 and Oculus Quest Pro.
  • Q: Is there any recovery in case of failure during streaming?
  • A: No. RTSP as a protocol does not provide any reconnect or restart functionality. Overall it is a relatively simple protocol, so you have to manage connection drops or failures yourself (a watchdog sketch follows after this FAQ). The best the package can do is not crash and instead stop somewhat gracefully.
  • Q: Can it do audio as well?
  • A: No, and it will not any time soon. Unity's audio setup is far too complex to implement this feature.
  • Q: Is it 100% fail-safe to stream over a long period of time?
  • A: Probably not. People have been streaming for several hours, but if there is a disruption on the network layer or a buffer overflow in some component outside the package, the streaming might just stop without any further error message or notice.
  • Q: Can I stream any resolution at any frame rate or any bitrate I want?
  • A: Probably not. H264, for example, is specified for a maximum resolution of 4K (and that is the absolute maximum!). To go beyond that, you would need to use H265.
  • Q: Can I stream more than one camera at a time?
  • A: Yes, but there are multiple catches. All encoding is done in hardware to make it as fast as possible. This means you need a physical hardware encoder unit, and the question is how many of those your device provides (probably 4 at most). There is no fallback to software, so if you run out of encoder units, creating another one will fail.
  • Q: What makes streaming at high frame rates and large resolutions such a problem?
  • A: The way this (and basically any other) plugin works is to retrieve the rendered image from the GPU, pass it through the marshalling barrier from C# to C++, and then hand it to the encoder and the pusher/server. GPU readback has always been an issue (it is a little better now because deep learning demands wider bus widths), but in general there is a bottleneck somewhere: either the GPU readback bus or the general memory bus. The road is simply not wide enough, so to say. A simple math example: a single stream at 1920x1080 pixels, 4 bytes per pixel, 60 frames per second amounts to roughly 500MB per second! (A readback sketch follows after this FAQ.)
  • Q: It stutters when I use high bitrates.
  • A: Well, what counts as high? Over WiFi, 20MBit/s will likely max out your standard 2.4GHz network already, and on 5GHz it will also become crowded. It is just a lot of data, that’s it…
  • Q: Why is the bitrate for live streaming higher than for a regular offline video created by e.g. Handbrake?
  • A: Because we do not know the video data in advance and there is no two-pass or backward-looking encoding. There is only forward encoding, and the encoder has to treat the data as unseen as it comes in. This means you need a higher bitrate and shorter keyframe intervals.
  • Q: Is there a test package available before I buy it?
  • A: No, but if you send me a nice email with some info about your use case, I can probably make something work.
  • Q: There is something not working as I expected. What should I do?
  • A: Send me an email! There are always cases that might work one way or another, but it heavily depends on your setup.
  • Q: How are the internals working and where are the bottlenecks, particularly concerning a VR case?
  • A: If you are working in Unity3D, you are working in a 3D game engine. It is not a 2D player app that accesses the “surface” of your device directly. Therefore, there are copy operations involved that bring the rendered image content into a texture surface and read it back from the GPU for encoding. To quote some of the discussions I had previously (thanks to Kyle Johnson for testing this out):

I’ve gotten streaming to work on the Quest just fine.  It wasn’t hard at all.  I tried all 3 streaming modes, as well as both render texture and camerainput.  Everything seems to work fine.  On the camerainput modes, you do have to set the InputFormat to COLORSPACE.RGBA32, but otherwise, it’s fine (I had to do that on my android phone as well).  Vulkan or GLES3 work fine.

720p30 streaming is doable. It definitely hits the frame rate a bit on any real scene (the spinning cube stays at 72fps, but I tried a few real scenes and it couldn’t hold). Performance metrics are really hard on the Quest 2, but I’d say you take about a 50% hit on performance (half of what you could previously render at 72fps) when streaming at 720p30. 1080p30 is too much, even for a basic scene to hold at 72fps. 480p is better.

Generally, performance seems on par with Unity’s WebRTC package, which also works on Quest with similar limitations.  The latency of the latter is better, but the setup is quite painful by contrast because you have to deal with the signaling.  Also, the ability of your package to run as a server and just watch on any device with VLC player is pretty cool.
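As mentioned in the FAQ, the package itself does not reconnect after a network failure, so one pragmatic option is a small watchdog on the Unity side that periodically probes the RTSP endpoint and triggers whatever restart logic fits your setup. The sketch below only checks TCP reachability of an example server address and port; actually restarting the stream has to go through the package's own API, which is not shown here.

```csharp
using System;
using System.Collections;
using System.Net.Sockets;
using UnityEngine;

// Hedged sketch: poll the RTSP server's TCP port and invoke a user-supplied
// restart action when it becomes unreachable. The host/port values are
// examples only; restarting the stream itself must use the package's API.
public class RtspWatchdogSketch : MonoBehaviour
{
    public string serverHost = "192.168.0.10"; // example address, adjust to your setup
    public int serverPort = 8554;              // example RTSP port, adjust to your setup
    public float checkIntervalSeconds = 5f;
    public Action onConnectionLost;            // hook your own restart logic here

    IEnumerator Start()
    {
        var wait = new WaitForSeconds(checkIntervalSeconds);
        while (true)
        {
            yield return wait;
            if (!CanConnect())
            {
                Debug.LogWarning("RTSP endpoint unreachable - triggering restart logic.");
                onConnectionLost?.Invoke();
            }
        }
    }

    bool CanConnect()
    {
        // Note: this blocks for up to one second; in a real app you would
        // move the probe off the main thread to avoid frame hitches.
        try
        {
            using (var client = new TcpClient())
            {
                var result = client.BeginConnect(serverHost, serverPort, null, null);
                bool ok = result.AsyncWaitHandle.WaitOne(TimeSpan.FromSeconds(1));
                return ok && client.Connected;
            }
        }
        catch (SocketException)
        {
            return false;
        }
    }
}
```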
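To make the readback cost discussed in the FAQ concrete, here is a small sketch (independent of this package) that reads a RenderTexture back asynchronously with Unity's AsyncGPUReadback API and logs the per-second data volume. It only illustrates where the bandwidth goes; the package's internal readback path may differ.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Illustration of the GPU-readback bottleneck discussed in the FAQ:
// 1920 x 1080 pixels x 4 bytes x 60 fps is roughly 500 MB per second.
// This is NOT the package's internal code, just a generic Unity sketch.
public class ReadbackBandwidthSketch : MonoBehaviour
{
    public RenderTexture source;   // e.g. the texture your capture camera renders into
    private long bytesThisSecond;
    private float timer;

    void Update()
    {
        // Asynchronous readback avoids stalling the render thread the way
        // a blocking Texture2D.ReadPixels call would.
        AsyncGPUReadback.Request(source, 0, TextureFormat.RGBA32, OnReadback);

        timer += Time.deltaTime;
        if (timer >= 1f)
        {
            Debug.Log($"Readback volume: {bytesThisSecond / (1024f * 1024f):F1} MB/s");
            bytesThisSecond = 0;
            timer = 0f;
        }
    }

    void OnReadback(AsyncGPUReadbackRequest request)
    {
        if (request.hasError) return;
        // Each frame of RGBA32 data is width * height * 4 bytes.
        bytesThisSecond += request.GetData<byte>().Length;
    }
}
```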

NOTE FOR iOS15+

There is an issue, first discovered with iOS 17, where iOS is no longer tolerant of code-signature errors. If you encounter any problems with code signing when installing on an iOS device, try replacing the frameworks in the Assets/Vendors/Vizario/Plugins/iOS folder with the files in this zip file. If it still does not work, please send me an email!
ZIP File