Deep Learning Meets Photography: Simulating Bokeh Effects in Selfies Using Mask R-CNN

Front-facing smartphone cameras often struggle to produce the pleasing background blur (bokeh) found in professional portraits. Discover how one developer leveraged PyTorch's Mask R-CNN model and OpenCV to create an AI-powered pipeline that adds realistic bokeh to ordinary selfies.

The Selfie Bokeh Challenge

Smartphone front cameras sacrifice optical quality for compactness, making natural background blur—a hallmark of professional photography—nearly impossible to achieve optically. While apps like Google Photos simulate this computationally, developer Rahul Ravikumar explored building an open-source alternative using deep learning fundamentals.

The Technical Pipeline

Ravikumar's approach combines computer vision and deep learning in three stages:

Subject Segmentation: Using PyTorch's pretrained Mask R-CNN model (trained on the COCO dataset, which includes a person class), the pipeline isolates the human subject from the background. The model outputs a segmentation mask marking the pixels that belong to the person:

Segmentation mask generated by Mask R-CNN
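
A minimal sketch of how this stage might look is shown below. It is not taken from the original project: the helper names person_mask and segment_person, the score threshold of 0.7, and the mask threshold of 0.5 are all assumptions made for illustration. It relies on torchvision's maskrcnn_resnet50_fpn, whose outputs include labels, scores, and soft masks of shape (N, 1, H, W), and on the fact that torchvision's COCO indexing uses label 1 for "person".

```python
import numpy as np

PERSON_LABEL = 1  # "person" in torchvision's COCO category indexing


def person_mask(labels, scores, masks, score_thresh=0.7, mask_thresh=0.5):
    """Merge all confident 'person' detections into one boolean HxW mask.

    labels: (N,) ints, scores: (N,) floats, masks: (N, 1, H, W) floats in [0, 1],
    i.e. the layout of Mask R-CNN outputs after conversion to NumPy.
    Both thresholds are illustrative assumptions, not values from the project.
    """
    labels = np.asarray(labels)
    scores = np.asarray(scores)
    masks = np.asarray(masks)
    keep = (labels == PERSON_LABEL) & (scores >= score_thresh)
    if not keep.any():
        return np.zeros(masks.shape[-2:], dtype=bool)
    # Union of the thresholded soft masks of every kept detection.
    return (masks[keep, 0] > mask_thresh).any(axis=0)


def segment_person(image_rgb):
    """Run the pretrained model on an HxWx3 uint8 RGB image.

    Imports are local so the helper above works without PyTorch installed.
    The first call downloads the pretrained weights.
    """
    import torch
    import torchvision

    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()
    tensor = torch.from_numpy(np.ascontiguousarray(image_rgb))
    tensor = tensor.permute(2, 0, 1).float() / 255.0  # CxHxW in [0, 1]
    with torch.no_grad():
        out = model([tensor])[0]
    return person_mask(
        out["labels"].cpu().numpy(),
        out["scores"].cpu().numpy(),
        out["masks"].cpu().numpy(),
    )
```

Taking the union of every confident person detection keeps group selfies intact. A stricter variant could keep only the largest or highest-scoring detection.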

Background Processing: The background is filtered with a custom convolution kernel to simulate bokeh. Ravikumar builds the kernel by multiplying a 2D Gaussian by a triangular amplitude mask, so that bright points spread into soft, shaped highlights rather than a uniform smear:

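import cv2
import numpy as np

triangle = np.array([...])  # Triangular pattern (elided in the source); must broadcast against the 11x11 kernel
kernel = cv2.getGaussianKernel(11, 5.)  # 11x1 column of Gaussian weights, sigma = 5
kernel = kernel * kernel.transpose() * triangle  # Outer product gives an 11x11 Gaussian, then the triangle masks it
kernel = kernel / np.sum(kernel)  # Normalize so the weights sum to 1 and brightness is preserved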
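
Because the triangular pattern is elided above, the following sketch fills in one hypothetical shape so the kernel can be inspected end to end. The functions gaussian_1d, triangle_mask, and bokeh_kernel are illustrative names, not part of the original project, and the apex-up binary triangle is an assumption. gaussian_1d is a NumPy stand-in intended to mirror what cv2.getGaussianKernel returns for a positive sigma, so the sketch runs without OpenCV.

```python
import numpy as np


def gaussian_1d(size, sigma):
    """Normalized Gaussian weights as a (size, 1) column vector."""
    x = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return (g / g.sum()).reshape(-1, 1)


def triangle_mask(size):
    """Hypothetical binary triangle: apex at the top center, base along the bottom row."""
    rows = np.arange(size).reshape(-1, 1)
    cols = np.arange(size).reshape(1, -1)
    center = (size - 1) / 2.0
    # Allowed half-width grows linearly with the row index.
    return (np.abs(cols - center) <= rows / 2.0).astype(np.float64)


def bokeh_kernel(size=11, sigma=5.0):
    """2D Gaussian (outer product of the 1D weights) masked by the triangle."""
    g = gaussian_1d(size, sigma)
    kernel = (g @ g.T) * triangle_mask(size)
    return kernel / kernel.sum()  # weights sum to 1
```

A kernel like this could then be applied to a float image with cv2.filter2D(background, -1, kernel). Note that cv2.filter2D computes correlation rather than true convolution, so an asymmetric kernel such as this one appears flipped in the output highlights unless it is flipped first.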

Image Reconstruction: The processed background is merged with the untouched foreground subject. The quality of the result depends on the accuracy of the segmentation mask, and Ravikumar acknowledges that the pipeline works best with standard portrait selfies.

Isolated foreground before merging
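
The merge step can be sketched as a per-pixel blend driven by the mask. This is an illustrative assumption about how the recombination might be written, not the project's code: the function name composite and the convention that a mask value of 1 marks the subject are both hypothetical.

```python
import numpy as np


def composite(foreground, blurred_background, mask):
    """Blend two HxWx3 float images in [0, 1] using an HxW mask.

    Mask values are clipped to [0, 1]; 1 keeps the sharp foreground pixel,
    0 keeps the blurred background pixel, and values in between mix the two.
    """
    alpha = np.clip(np.asarray(mask, dtype=np.float64), 0.0, 1.0)[..., None]
    return alpha * foreground + (1.0 - alpha) * blurred_background
```

With a strictly binary mask this reduces to picking each pixel from one image or the other, which is why any error in the mask shows up directly as a sharp patch of background or a blurred patch of the subject.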

Why This Approach Matters

Unlike a plain Gaussian blur filter, this method approximates how out-of-focus highlights behave in a real lens through:

Channel-specific intensity boosting (r = np.where(r > 0.9, r * 2, r))
Clipping of the filtered result to a maximum of 1.0 to prevent overflow (np.where(fr > 1., 1., fr))
Precise foreground/background recombination using segmentation masks
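
The first two items can be illustrated on a single channel. The sketch below wraps the two np.where expressions quoted above in functions with hypothetical names (boost_highlights, clip_to_unit). It assumes channel values are floats in [0, 1] and that the boost runs before filtering while the clip runs after it, which the article implies but does not state outright.

```python
import numpy as np


def boost_highlights(channel, threshold=0.9, gain=2.0):
    """Multiply values above the threshold by the gain; leave the rest unchanged."""
    return np.where(channel > threshold, channel * gain, channel)


def clip_to_unit(channel):
    """Cap values at 1.0 so the result stays in the displayable range."""
    return np.where(channel > 1.0, 1.0, channel)


r = np.array([0.50, 0.95])
boosted = boost_highlights(r)     # the 0.95 pixel is pushed above 1.0
restored = clip_to_unit(boosted)  # and capped back to 1.0
```

Boosting before the blur lets bright pixels contribute more energy to their neighborhood, so highlights stay visible as distinct shapes after filtering instead of being averaged away.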

The technique demonstrates how accessible deep learning models (like Mask R-CNN) can be repurposed for creative applications beyond their original training objectives.

Real-World Results

Original unprocessed selfie

Final image with simulated bokeh

While not production-ready, the project showcases the untapped potential of combining classical computer vision techniques with modern deep learning. As segmentation models improve, such pipelines could become standard in mobile photography—no dual lenses required.

Source: Implementation adapted from Rahul Ravikumar's Bokehlicious Selfies project.