As you've noticed, the MP4 format can be hard to use in this kind of situation. I suspect the linked blog post doesn't go into detail about the “fix” because it can get quite complicated. In addition to writing the missing size field of the mdat box, you also need to create the ftyp and moov boxes. If you really need a complete MP4 solution, ISO 14496-12 and ISO 14496-14 will tell you more than you ever wanted to know about how to build these data structures.
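To give a feel for the simpler half of that fix, here is a minimal Python sketch of the box layout (32-bit big-endian size, four-character type, payload), an ftyp box, and patching the mdat size once the data length is known. The function names and the brand values (`isom`, `mp42`) are my own illustrative assumptions, not something from the blog post, and the sketch assumes the mdat box runs to the end of the buffer. It does not attempt the moov box, which is the genuinely hard part, since it needs the sample tables.

```python
import struct


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Serialize one box: 32-bit big-endian size, 4-byte type, then payload."""
    if len(box_type) != 4:
        raise ValueError("box type must be exactly 4 bytes")
    size = 8 + len(payload)  # the size field counts the 8-byte header too
    if size > 0xFFFFFFFF:
        raise ValueError("payload too large for a 32-bit box size")
    return struct.pack(">I4s", size, box_type) + payload


def make_ftyp(major_brand: bytes = b"isom",
              minor_version: int = 0,
              compatible_brands: tuple = (b"isom", b"mp42")) -> bytes:
    """Build an ftyp box. The default brands are assumptions for illustration."""
    payload = struct.pack(">4sI", major_brand, minor_version)
    payload += b"".join(compatible_brands)
    return make_box(b"ftyp", payload)


def patch_mdat_size(buf: bytearray, mdat_offset: int) -> None:
    """Overwrite the size field of the mdat box starting at mdat_offset.

    Assumes the mdat box extends to the end of buf.
    """
    if bytes(buf[mdat_offset + 4:mdat_offset + 8]) != b"mdat":
        raise ValueError("no mdat box at the given offset")
    size = len(buf) - mdat_offset
    if size > 0xFFFFFFFF:
        raise ValueError("mdat too large for a 32-bit box size")
    struct.pack_into(">I", buf, mdat_offset, size)
```

Calling `patch_mdat_size` on a captured buffer rewrites the four size bytes in place; you would still have to generate a moov box describing the samples before a player would accept the file.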
However, you may find that a much more elegant solution is to use a format that is actually suited to real-time streaming. That is, on the Android side, remux the video stream into a real-time format and send that to the server. On the server side, you then have a great deal of flexibility in processing the video: you can remux the whole video back into MP4, cut it into 10-second chunks, or whatever else you need. The open-source Sipdroid project contains some code that demonstrates remuxing live video into RTP. (You may prefer a reliable transport, such as RTP over TCP, but the principle is the same.)
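For orientation, this is roughly what the RTP side looks like at the byte level. The sketch below is written in Python for brevity and is not Sipdroid's code: it builds the 12-byte fixed RTP header from RFC 3550 and, for the TCP case, adds the 2-byte length prefix from RFC 4571. The function names are hypothetical, and payload type 96 is just a commonly used dynamic value that I am assuming here. Real H.264 video would also need codec-specific payload packetization (RFC 6184), which this does not handle.

```python
import struct


def rtp_packet(payload: bytes, seq: int, timestamp: int, ssrc: int,
               payload_type: int = 96, marker: bool = False) -> bytes:
    """Prepend a 12-byte RTP fixed header (RFC 3550) to payload.

    payload_type=96 is an assumed dynamic payload type, for illustration only.
    """
    byte0 = 2 << 6  # version 2, no padding, no extension, zero CSRCs
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(">BBHII",
                         byte0,
                         byte1,
                         seq & 0xFFFF,            # 16-bit sequence number
                         timestamp & 0xFFFFFFFF,  # 32-bit media timestamp
                         ssrc & 0xFFFFFFFF)       # 32-bit stream identifier
    return header + payload


def frame_for_tcp(packet: bytes) -> bytes:
    """Frame one RTP packet for a byte stream: 16-bit length prefix (RFC 4571)."""
    if len(packet) > 0xFFFF:
        raise ValueError("packet too large for a 16-bit length prefix")
    return struct.pack(">H", len(packet)) + packet
```

Over UDP you would send the result of `rtp_packet` as a single datagram; over TCP you would write `frame_for_tcp(rtp_packet(...))` to the socket so the server can find the packet boundaries again.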