Deep Learning Meets Photography: Simulating Bokeh Effects in Selfies Using Mask R-CNN
The Selfie Bokeh Challenge
Smartphone front cameras sacrifice optical quality for compactness, making natural background blur—a hallmark of professional photography—nearly impossible to achieve optically. While apps like Google Photos simulate this computationally, developer Rahul Ravikumar explored building an open-source alternative using deep learning fundamentals.
The Technical Pipeline
Ravikumar's approach combines computer vision and deep learning in three stages:
- Subject Segmentation: Using a pretrained Mask R-CNN model from PyTorch's torchvision library (trained on the COCO dataset), the pipeline isolates the human subject from the background. The model generates a segmentation mask identifying the pixels that belong to the person:
Segmentation mask generated by Mask R-CNN
- Background Processing: The isolated background undergoes a custom convolution process to simulate bokeh. Ravikumar implemented a kernel combining Gaussian blur with a triangular amplitude mask to create organic light dispersion effects:
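import cv2
import numpy as np

triangle = np.array([...])  # Triangular pattern (elided in the source); must broadcast against the 11x11 kernel
kernel = cv2.getGaussianKernel(11, 5.)  # 11x1 column of Gaussian weights, sigma = 5
kernel = kernel * kernel.transpose() * triangle  # outer product gives an 11x11 Gaussian, shaped by the triangle
kernel = kernel / np.sum(kernel)  # normalize so the weights sum to 1 and overall brightness is preserved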
- Image Reconstruction: The processed background is merged with the untouched foreground subject. The quality of the result hinges entirely on segmentation accuracy, and Ravikumar acknowledges that the approach works best with standard portrait selfies.
Isolated foreground before merging
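The segmentation and reconstruction stages can be sketched with NumPy alone. The snippet below is a minimal illustration, not Ravikumar's code. It assumes the model output has already been converted to NumPy arrays in the layout torchvision's detection models return (labels, scores, and soft masks of shape (N, 1, H, W)), and that label 1 is the COCO "person" class, as it is in torchvision. The names person_mask and composite, and both 0.5 thresholds, are hypothetical choices made for this example.

```python
import numpy as np

# Class id of "person" in torchvision's COCO-trained detection models.
PERSON_LABEL = 1


def person_mask(labels, scores, masks, score_thresh=0.5, mask_thresh=0.5):
    """Return a boolean (H, W) mask covering every confident person detection.

    labels: (N,) integer class ids
    scores: (N,) detection confidences in [0, 1]
    masks:  (N, 1, H, W) soft masks in [0, 1]
    The thresholds are illustrative assumptions, not values from the source.
    """
    keep = (labels == PERSON_LABEL) & (scores >= score_thresh)
    if not keep.any():
        # No person found: everything counts as background.
        return np.zeros(masks.shape[-2:], dtype=bool)
    # Binarize each kept mask, then take the union across detections.
    return (masks[keep, 0] > mask_thresh).any(axis=0)


def composite(original, blurred, mask):
    """Keep original pixels where mask is True, blurred pixels elsewhere.

    original, blurred: (H, W, 3) float images; mask: (H, W) boolean.
    """
    return np.where(mask[..., None], original, blurred)
```

Because the foreground is copied straight from the original image, any pixel the mask misses ends up blurred, and any background pixel it wrongly includes stays sharp. That is why the result depends so heavily on segmentation accuracy.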
Why This Approach Matters
Unlike simple Gaussian blur filters, this method imitates how bright points of light spread in an out-of-focus background, using:
- Channel-specific intensity boosting (r = np.where(r > 0.9, r * 2, r))
- Constrained convolution to prevent overflow (np.where(fr > 1., 1., fr))
- Precise foreground/background recombination using segmentation masks
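The first two steps can be sketched for a single color channel as follows. This is a hedged illustration under several assumptions: the triangle_mask helper is a hypothetical stand-in, because the source elides the actual triangular pattern; gaussian_kernel_1d reproduces the normalized column that cv2.getGaussianKernel returns so the example needs only NumPy; and filter2d is a plain correlation with edge padding, which is an assumed border treatment. The 0.9 threshold, the factor of 2, and the clamp at 1.0 come from the expressions quoted above, and the sketch assumes the boost is applied before the blur.

```python
import numpy as np


def gaussian_kernel_1d(size, sigma):
    """Normalized Gaussian weights as a (size, 1) column."""
    x = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return (g / g.sum()).reshape(-1, 1)


def triangle_mask(size):
    """Hypothetical upward-pointing triangle of ones (NOT the author's pattern)."""
    mask = np.zeros((size, size))
    center = size // 2
    for row in range(size):
        half = row // 2
        mask[row, center - half:center + half + 1] = 1.0
    return mask


def bokeh_kernel(size=11, sigma=5.0):
    """Gaussian outer product shaped by the triangle, normalized to sum to 1."""
    g = gaussian_kernel_1d(size, sigma)
    kernel = (g * g.T) * triangle_mask(size)
    return kernel / kernel.sum()


def filter2d(channel, kernel):
    """Correlate a 2-D channel with the kernel; output has the input's shape."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(channel, ((ph, ph), (pw, pw)), mode="edge")
    h, w = channel.shape
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out


def bokeh_channel(channel, kernel, threshold=0.9, gain=2.0):
    """Boost highlights, blur with the shaped kernel, then clamp to 1.0."""
    boosted = np.where(channel > threshold, channel * gain, channel)
    blurred = filter2d(boosted, kernel)
    return np.where(blurred > 1.0, 1.0, blurred)
```

Doubling only the near-saturated pixels gives highlights extra energy, so after convolution they show up as visible kernel-shaped spots rather than fading into the surrounding blur. The final clamp keeps the channel within the [0, 1] range expected for float images.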
The technique demonstrates how accessible deep learning models (like Mask R-CNN) can be repurposed for creative applications beyond their original training objectives.
Real-World Results
Original unprocessed selfie
Final image with simulated bokeh
While not production-ready, the project showcases the untapped potential of combining classical computer vision techniques with modern deep learning. As segmentation models improve, such pipelines could become standard in mobile photography—no dual lenses required.
Source: Implementation adapted from Rahul Ravikumar's Bokehlicious Selfies project.