AI based Foveated rendering - new technique - I hope this gets used for the 8KX! :)

Matt_Goodman · January 18, 2020, 5:08pm

Not quite sure where to put this but I thought some people might be interested in it so check this out

Could be great for limiting bandwidth, but I guess it would then need decoding on the headset, so it’s probably not useful in its current form. Interesting still!

jojon · January 18, 2020, 6:38pm

I take it this requires per-scene data from pre-analysis against a ground truth?

Frc · January 18, 2020, 7:53pm

It seems to be a different approach than Nvidia DLSS. Preemptive full analysis is not required, but latency could be problematic if the refresh rate is not very high (90 Hz or more). Maybe you can read more into it than I can; link below:

3.2 Design Goals. There are several goals that we would like to achieve with our method. First, the DeepFovea network should be able to operate in an online mode, i.e., it should be able to reconstruct the current frame based only on the past frames

3.2.2 Performance Considerations. If the method is used for gaze-contingent reconstruction, it has to exhibit under 50ms of latency for each frame in order to be unnoticeable for human vision [Guenter et al. 2012]. Moreover, for head-mounted displays, the method has to run at HMD’s native refresh rate and high resolution to avoid motion sickness and provide a comfortable experience. For many existing VR HMDs the minimum refresh rate is 90Hz.

Recurrence. In order to generate temporally stable video content, the network needs to accumulate state through time. Moreover, a temporal network is able to super-resolve features through time and can work with sparser input while achieving the same quality. However, our network has to be causal (i.e., cannot see the future video stream) and should have a compact state to retain over time due to high video resolution and performance constraints

NoamLoop · January 18, 2020, 7:59pm

Certainly, but question would be - how close does the training data have to be to the actually played game? So do they have to train this for each game? Or is it generic enough to cope with a wide spectrum without retraining?

Frc · January 18, 2020, 8:01pm

Not really. It only needs to be in sync with the eye-tracking system, but the game engine must also support it and be able to take into account the pupil positions provided by the headset.

Frc · January 18, 2020, 8:08pm

If you read the excerpt from the paper, they say 50 ms latency max and online performance (serial image aggregation with no future data). So at 90 Hz → 90 images in 1000 ms, which means 50 ms = 4.5 images?
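To make that arithmetic explicit, here is a minimal Python sketch. It only converts the paper's 50 ms latency limit into a number of frames at a given refresh rate; the function name `frames_in_latency_budget` is made up for illustration and is not from the paper.

```python
def frames_in_latency_budget(refresh_hz: float, latency_ms: float) -> float:
    """How many frame intervals fit into a latency budget.

    One second (1000 ms) holds `refresh_hz` frames, so `latency_ms`
    holds latency_ms * refresh_hz / 1000 frames.
    """
    return latency_ms * refresh_hz / 1000.0


# 90 Hz display, 50 ms budget quoted from the paper
print(frames_in_latency_budget(90, 50))  # 4.5
```

So at 90 Hz the reconstruction may lag by roughly four and a half frames before it crosses the quoted 50 ms threshold; at a higher refresh rate the same 50 ms covers more frames.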

jojon · January 18, 2020, 8:19pm

Rather a few magnitudes above my pay grade, but I do see your quoted bits, as well as mentions of temporal sampling and motion compensation.

I can imagine fast camera motion would become a serious problem; On the other hand, we do have that car speeding away… wonder what it would be like a few frames later. :7

What really impresses me, is if these guys can do all this real-time analysis and reconstruction stuff, and not have it be more computationally intensive than it would be to just render everything in full in the first place.

So many questions, and so hard to wrap one’s head around it all. At any rate, there seems to be plenty of interesting things to look forward to (…and quite a few consequences to be concerned with…)
I cannot help but imagine that, while the algorithm is probably generic enough, the data would have to be very specific. I guess, following Frc’s post, in this case it is generated on the fly, from samples accumulated from previous frames, and so intrinsically relevant…

NoamLoop · January 18, 2020, 8:33pm

Looks like they found a reasonably representative training set that works for various content: “We train on videos sampled from a video dataset [Abu-El-Haija et al. 2016] that contains a variety of natural content such as people, animals, nature, text, etc. Each video has resolution up to 640x480 and up to 150 frames. For each video, we precompute the optical flow using the FlowNet2 network [Ilg et al. 2017]. Next, the video is downsized to 128x128 to meet GPU memory restrictions. Lastly, the videos are sliced into 32-frame-long chunks with an overlap of 8 frames. The total number of video sequences in the training set is about 350,000”
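As a rough illustration of the slicing step in that quote, here is a small Python sketch. The chunk length (32) and overlap (8) come from the quote; everything else is an assumption on my side, in particular the helper name `chunk_starts` and the choice to drop a tail that is shorter than a full chunk (the quote does not say how leftovers are handled).

```python
def chunk_starts(num_frames: int, chunk_len: int = 32, overlap: int = 8) -> list[int]:
    """Start indices of fixed-length chunks that overlap by `overlap` frames.

    Assumption: an incomplete chunk at the end of the video is discarded.
    """
    stride = chunk_len - overlap  # 24 new frames per chunk
    return list(range(0, num_frames - chunk_len + 1, stride))


# A video at the quoted maximum length of 150 frames
print(chunk_starts(150))  # [0, 24, 48, 72, 96]
```

Under these assumptions a 150-frame video yields five chunks, so about 350,000 sequences would correspond to at least some tens of thousands of source videos.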

Question now is of course how hand-picked (and close to the training data) the demonstration example was.
Running some online training at the gamer’s site would somehow contradict the advantage of the concept. The network that has to tell the output of the generator and the original apart needs original input. And for that the full high detail image has to be rendered.
As far as I understand it the idea is that during execution (for which the latency and cycle-time numbers were provided) only the foveated part is actually rendered, then fed into the network and the output then sent to the HMD.

But cool, yes! Faster GPUs, more clever algorithms, future VR games might look really great!

Matt_Goodman · January 18, 2020, 8:42pm

Lots of interesting info from these replies. Good stuff guys!

jojon · January 18, 2020, 8:57pm

Ah, you actually read and grokked the thing, instead of just scrolling through the document and glancing at the pictures like… erm… some of us.

Kind of wondering how much data we’re talking about standing behind the network at runtime…

NoamLoop · January 18, 2020, 9:20pm

“The DeepFovea model has 3.1 million parameters and requires 111 GFLOP with 2.2GB memory footprint per GPU for an inference pass”
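For a feel of what those numbers mean at runtime, a back-of-the-envelope sketch in Python. The 3.1 million parameters and 111 GFLOP per pass are from the quote; the rest is assumed for illustration only: one inference pass per displayed frame, a 90 Hz display, and 4 bytes per parameter (32-bit floats). The function names are made up.

```python
def required_tflops(gflop_per_pass: float, passes_per_second: float) -> float:
    """Sustained compute needed, in TFLOP/s, for a given pass rate."""
    return gflop_per_pass * passes_per_second / 1000.0


def weights_megabytes(num_params: int, bytes_per_param: int = 4) -> float:
    """Storage for the weights alone (assumes 32-bit floats by default)."""
    return num_params * bytes_per_param / 1e6


# Assumption: one 111 GFLOP pass per frame at 90 Hz
print(required_tflops(111, 90))      # 9.99
# 3.1 million parameters stored as 32-bit floats
print(weights_megabytes(3_100_000))  # 12.4
```

If those assumptions hold, the weights themselves are only about 12 MB, so most of the 2.2GB footprint is presumably intermediate data rather than the model, and the sustained compute is on the order of 10 TFLOP/s.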

jojon · January 18, 2020, 9:34pm

Aha, there we go… That’s… some. :7

NoamLoop · January 18, 2020, 10:12pm

Yepp, quad Tesla V100 for inference and 7x8 GPUs in a DGX-1 cluster for training is a little expensive for home users. But when compared to the supercomputers of yesteryear it’s actually a steal!

A few node shrinks might still be in the cards. And NVidia has afaik researched multi-chip GPU quite a while ago. Even if AMD seems to be the guys who put these into products nowadays.
Hopefully within a few years we all will have machines under our desks that can do stuff like this!

The marketing departments of the GPU manufacturers will be happy to have new reasons why every person needs a crank supercomputer at home

Just had a look at the top 500 supercomputer Wikipedia article:
In 2003 a 111 GFLOPS machine would still have been on the worldwide top 500 supercomputer list. And a little before 1995 it would have been the fastest several billion dollar, house filling, several-megawatt-for-lunch-eating machine in the world. These times are not that far away in my memory. Getting old

jojon · January 19, 2020, 11:14am

Then again, a dedicated room with one of those old round Cray cabinets, with the running fluorinert cooler next to it, would be kind of spiffy… :9
