This repository implements a Neural Radiance Field (NeRF) pipeline. It follows the logic of the original research papers as closely as I could manage, and the basic NeRF model can be trained once the synthetic data has been pre-processed. The model takes a 5D input (a 3D position and a 2D viewing direction) and synthesizes images of the scene through volume rendering. Training is currently supported on the NeRF synthetic datasets.
NeRF is a method for generating novel views of complex 3D scenes or objects by training a feed-forward network (an MLP) to represent a volumetric scene function. The MLP processes each 5D input and outputs an RGB color and a density value.
- Structured input. The 5D input cannot be read directly from the dataset. What is available for each image is the raw RGB value of every pixel and a transformation matrix. The transformation matrix encodes the orientation of the camera that shot that image, and from it we can derive the theta and phi values that are fed into the MLP.
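- Multi-layer perceptron (MLP) that takes the 3D position and the theta and phi viewing angles of points sampled along the ray of every pixel in every image, and outputs RGB and volume opacity values for those coordinates in 3D space. The pixel's own RGB value serves as the training target rather than as a network input.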
- Volume rendering to synthesize images by integrating through the learned 3D space. A ray is projected from every pixel in every image, and points are sampled along each ray; the MLP provides the RGB and volume opacity values at those points. Integrating the color and density values along a ray gives the predicted RGB value for its pixel. Once predictions are available for the entire batch, the mean squared error against the ground-truth pixel colors is used as the loss to optimize the MLP.
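
The three components above can be sketched end to end for a single pixel. This is a minimal illustration in plain Python, not the code in this repository: it assumes a pinhole camera looking down -z with a 3x4 camera-to-world matrix (the convention of the NeRF synthetic dataset), one particular theta/phi convention (polar angle from +z, azimuth in the xy-plane), and a hypothetical `toy_field` function standing in for the trained MLP. All function names and default values here are made up for the example.

```python
import math

def pixel_ray(i, j, width, height, focal, c2w):
    """World-space (origin, unit direction) of the ray through pixel (i, j).

    Assumes a pinhole camera looking down -z and a 3x4 pose c2w = [R | t].
    """
    # Ray direction in camera coordinates.
    d_cam = ((i - 0.5 * width) / focal, -(j - 0.5 * height) / focal, -1.0)
    # Rotate into world coordinates with the 3x3 part of the pose.
    d = [sum(c2w[r][k] * d_cam[k] for k in range(3)) for r in range(3)]
    norm = math.sqrt(sum(x * x for x in d))
    origin = [c2w[r][3] for r in range(3)]  # camera position is the last column
    return origin, [x / norm for x in d]

def direction_to_angles(d):
    """Unit direction -> (theta, phi); this angle convention is an assumption."""
    theta = math.acos(max(-1.0, min(1.0, d[2])))  # polar angle from +z
    phi = math.atan2(d[1], d[0])                  # azimuth in the xy-plane
    return theta, phi

def toy_field(point, theta, phi):
    """Hypothetical stand-in for the MLP: a dense red unit sphere at the origin."""
    inside = sum(x * x for x in point) < 1.0
    return (1.0, 0.0, 0.0), (10.0 if inside else 0.0)  # (rgb, density)

def render_ray(origin, direction, field, near=2.0, far=6.0, n_samples=64):
    """Approximate the volume rendering integral along one ray by a discrete sum."""
    theta, phi = direction_to_angles(direction)
    delta = (far - near) / n_samples   # spacing between samples
    transmittance = 1.0                # fraction of light not yet absorbed
    color = [0.0, 0.0, 0.0]
    for k in range(n_samples):
        t = near + (k + 0.5) * delta   # midpoint of the k-th interval
        point = [origin[a] + t * direction[a] for a in range(3)]
        rgb, sigma = field(point, theta, phi)
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        color = [color[a] + weight * rgb[a] for a in range(3)]
        transmittance *= 1.0 - alpha
    return color

def mse(pred, target):
    """Mean squared error between a predicted and a ground-truth RGB value."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)

# Usage: a camera at z = 4 with identity rotation, looking at the sphere.
pose = [[1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 4.0]]
origin, direction = pixel_ray(50, 50, 100, 100, 100.0, pose)
predicted = render_ray(origin, direction, toy_field)
print(predicted, mse(predicted, [1.0, 0.0, 0.0]))
```

The centre pixel's ray passes through the sphere, so its rendered color comes out very close to pure red, while a corner pixel's ray misses the sphere and stays black. A real implementation would batch these operations over many rays with a tensor library and backpropagate the loss through the MLP.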
- Clone the repository:
  git clone https://github.com/R2D2-08/NeRF.git
  cd NeRF
- Mildenhall, Ben, et al. "NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis." arXiv, 3 Aug. 2020.
- Müller, Thomas, et al. "Instant Neural Graphics Primitives with a Multiresolution Hash Encoding." ACM Transactions on Graphics, vol. 41, no. 4, July 2022, pp. 1–15.