I can give you a rough outline, as I… let me be charitable to myself and call it “understand” it… :7
There is not much to it; the PiP designation pretty much gives away the whole schtick. You render a low resolution view of the whole per-eye FOV, including what’s in the middle, around your sightline, using a wide camera frustum - what Varjo calls the “context” view. (The sightline is not necessarily in the middle of the bitmap, if the FOV reaches farther from straight ahead in one direction than in the others.) You also render a high resolution view encompassing only that small part in the middle, using a narrow frustum - maybe less than 100 by 100 degrees - the “focus” view. Then you composite these images together, so that the latter replaces its direct counterpart area in the former.
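To make the “not necessarily in the middle of the bitmap” bit concrete, here is a minimal sketch of where the focus view lands inside the context bitmap. It assumes a plain planar projection (pixel position is linear in the tangent of the view angle) and a focus frustum centred on straight ahead; the function name and parameters are made up for illustration, not taken from any Varjo API.

```python
import math

def focus_rect_in_context(ctx_fov_deg, focus_half_deg, ctx_size):
    """Pixel bounds (x0, y0, x1, y1) of the focus view inside the context bitmap.

    ctx_fov_deg:    (left, right, up, down) angles from straight ahead, in degrees.
    focus_half_deg: (horizontal, vertical) half-angles of the focus frustum,
                    assumed centred on straight ahead.
    ctx_size:       (width, height) of the context bitmap in pixels.
    """
    left, right, up, down = (math.tan(math.radians(a)) for a in ctx_fov_deg)
    fh, fv = (math.tan(math.radians(a)) for a in focus_half_deg)
    width, height = ctx_size

    # Planar projection: pixels are spread evenly over the tangent range.
    px_per_tan_x = width / (left + right)
    px_per_tan_y = height / (up + down)

    # Straight ahead sits at tan = 0, i.e. 'left' tangent units from the left edge.
    x0 = (left - fh) * px_per_tan_x
    x1 = (left + fh) * px_per_tan_x
    y0 = (up - fv) * px_per_tan_y
    y1 = (up + fv) * px_per_tan_y
    return x0, y0, x1, y1

# Asymmetric context FOV: 60 degrees to the left, 45 in the other directions.
print(focus_rect_in_context((60, 45, 45, 45), (20, 20), (1000, 1000)))
```

With the wider left side, the focus rectangle ends up right of the bitmap centre, even though it is still centred on the sightline.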
So you render the focus area twice, which is a bit wasteful in itself, but at least the copy you “throw away” is the lower resolution one, and in that one you could perfectly well mask that part out in your shaders - maybe by incorporating it into the hidden area mask…
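For a feel of how much of the context render that masking could skip, here is a rough back-of-envelope sketch. It assumes a planar projection with even pixel density over the view plane, a focus frustum that lies fully inside the context one, and it ignores the regular hidden area mask; `masked_fraction` is a made-up helper name.

```python
import math

def masked_fraction(ctx_fov_deg, focus_half_deg):
    """Fraction of context-view pixels covered by the focus view.

    ctx_fov_deg:    (left, right, up, down) angles from straight ahead, in degrees.
    focus_half_deg: (horizontal, vertical) half-angles of the focus frustum.
    """
    left, right, up, down = (math.tan(math.radians(a)) for a in ctx_fov_deg)
    fh, fv = (math.tan(math.radians(a)) for a in focus_half_deg)
    ctx_area = (left + right) * (up + down)   # view plane area, in tangent units
    focus_area = (2 * fh) * (2 * fv)
    return focus_area / ctx_area

# Context 120 x 120 degrees, focus 60 x 60 degrees: about 11% of the pixels.
print(masked_fraction((60, 60, 60, 60), (30, 30)))
```

Note how small the share is: on a flat view plane the periphery eats most of the pixels, so the double-rendered middle is a fairly cheap part of the context view.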
This gives you a degree of foveation in every cardinal direction: the high resolution “focus” imagery is surrounded by low resolution “context” imagery on all sides.
One does not need any special hardware (like the multiple displays in upper range Varjo headsets) to do this; it is just the rendering, so it could be mapped to any HMD. In that case, for a one-display-panel-per-eye HMD (…which I’m pretty sure the StarVR is, too…), the composite output bitmap does of course need to be a single resolution, matching the HMD screens, which the two source images are resampled to.
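As a toy illustration of that compositing step, here is a sketch that resamples both source images into one output bitmap, taking the focus image wherever its rectangle covers the output pixel. Everything here is an assumption for the sake of the example: images are plain lists of rows, sampling is nearest-neighbour, and the focus rectangle is given in normalised 0..1 coordinates of the context view. A real compositor would filter properly and account for lens distortion.

```python
def sample_nearest(img, u, v):
    """Nearest-neighbour lookup at normalised coordinates (u, v) in 0..1."""
    h, w = len(img), len(img[0])
    x = min(int(u * w), w - 1)
    y = min(int(v * h), h - 1)
    return img[y][x]

def composite_pip(context, focus, focus_rect, out_size):
    """Resample context and focus images into a single-resolution output.

    focus_rect: (u0, v0, u1, v1), the area the focus view covers within the
                context view, in normalised 0..1 coordinates.
    out_size:   (width, height) of the output bitmap, e.g. the HMD panel.
    """
    u0, v0, u1, v1 = focus_rect
    out_w, out_h = out_size
    out = []
    for j in range(out_h):
        v = (j + 0.5) / out_h          # sample at pixel centres
        row = []
        for i in range(out_w):
            u = (i + 0.5) / out_w
            if u0 <= u < u1 and v0 <= v < v1:
                # Inside the focus area: the high resolution image wins.
                fu = (u - u0) / (u1 - u0)
                fv = (v - v0) / (v1 - v0)
                row.append(sample_nearest(focus, fu, fv))
            else:
                row.append(sample_nearest(context, u, v))
        out.append(row)
    return out

context = [["c"] * 2 for _ in range(2)]   # tiny low resolution "context" image
focus = [["f"] * 4 for _ in range(4)]     # higher resolution "focus" image
for row in composite_pip(context, focus, (0.25, 0.25, 0.75, 0.75), (8, 8)):
    print("".join(row))
```

The printout is an 8 by 8 grid of `c` with a 4 by 4 block of `f` in the middle, which is the whole PiP idea in miniature.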
This is conceptually slightly different from slicing up the full per-eye view frustum into two or more tiling segments. There, each segment’s view plane can be independently rotated to be perpendicular to its own frustum partition, for less anisotropy (which manifests as the unnecessarily rendering-expensive stretching you get at the edges); each segment can potentially be rendered at its own resolution, if one so wants; and the segments are then reprojected (…and resampled) to a contiguous single view screen.
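The edge stretching is easy to put numbers on. A minimal sketch, assuming a single horizontal axis, equal angular segments, and each segment’s view plane turned to face the middle of its own segment (the helper names are made up): the width of a flat view plane at unit distance grows with the tangent of the half-angle, so a wide FOV on one plane is far wider than the same FOV split across rotated planes.

```python
import math

def plane_width(fov_deg):
    """Width of a flat view plane at unit distance, covering fov_deg symmetrically."""
    return 2.0 * math.tan(math.radians(fov_deg) / 2.0)

def split_widths(total_fov_deg, segments):
    """Return (single plane width, summed width of equal rotated segments)."""
    single = plane_width(total_fov_deg)
    tiled = segments * plane_width(total_fov_deg / segments)
    return single, tiled

# 140 degrees on one plane versus two 70 degree planes:
# roughly 5.5 versus 2.8 units of view plane width.
print(split_widths(140, 2))
```

At the same pixel density in the middle of each plane, the pixel count along that axis scales with these widths, so the two-segment split needs about half the columns here.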
Setting up and rendering multiple views incurs a cost in itself, of course… NVidia actually supplied some functions for this sort of thing with VRWorks, back when the GTX 10x0 series of graphics cards launched, but hardly anybody ever used them, because they are of course proprietary, and will only work with NVidia GPUs.