Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
Pages
Posts
How to Implement a Convolutional Layer from Scratch?
Published:
A tutorial on implementing a convolutional layer from scratch with NumPy, covering both forward and backward propagation.
Introduction
Convolutional Neural Networks (CNNs) have been widely adopted in various computer vision tasks and have achieved remarkable performance. With the development of deep learning frameworks such as PyTorch and TensorFlow, researchers can quickly build CNN models and start experimenting in a short amount of time. While these frameworks undoubtedly accelerate research progress and allow us to avoid reinventing the wheel, it is still crucial to understand the details of a convolutional layer to foster further innovation. Motivated by this, in this technical blog we present an implementation of a convolutional layer from scratch.
Implementation
Forward Pass
Implementation of the forward pass for a convolutional layer. The input consists of N data points, each with C channels, height H, and width W. We convolve each input with F different filters, where each filter spans all C channels and has height HH and width WW.
Input:
- x: Input data of shape (N, C, H, W)
- w: Filter weights of shape (F, C, HH, WW)
- b: Biases, of shape (F,)
- conv_param: A dictionary with the following keys:
  - 'stride': The number of pixels between adjacent receptive fields in the horizontal and vertical directions.
  - 'pad': The number of pixels that will be used to zero-pad the input.
Output:
- out: Output data, of shape (N, F, H', W') where H' and W' are given by: H' = 1 + (H + 2 * pad - HH) / stride and W' = 1 + (W + 2 * pad - WW) / stride
- cache: (x, w, b, conv_param)
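For example, with H = W = 32, HH = WW = 3, pad = 1, and stride = 1, we get H' = W' = 1 + (32 + 2 * 1 - 3) / 1 = 32, so this common 3x3 same-padding configuration preserves the spatial dimensions.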
Python Code
import math
import numpy as np

def conv_forward(x, w, b, conv_param):
    N, C, H, W = x.shape
    F, C, HH, WW = w.shape
    pad = conv_param['pad']
    stride = conv_param['stride']
    # Output spatial dimensions, per the formula above.
    H_ = 1 + math.floor((H + 2 * pad - HH) / stride)
    W_ = 1 + math.floor((W + 2 * pad - WW) / stride)
    out = np.zeros((N, F, H_, W_))
    # Zero-pad the two spatial dimensions only.
    x_padded = np.pad(x, [(0,), (0,), (pad,), (pad,)])
    for n in range(N):
        for f in range(F):
            for i in range(H_):
                for j in range(W_):
                    # Elementwise product of the filter and its
                    # receptive field, summed over all C channels.
                    out[n, f, i, j] = np.sum(
                        x_padded[n, :, i*stride:i*stride+HH, j*stride:j*stride+WW]
                        * w[f]) + b[f]
    cache = (x, w, b, conv_param)
    return out, cache
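As a quick sanity check (our own illustrative example, not part of the original tutorial), the snippet below runs conv_forward on a small random batch to confirm the output shape formula, and on an all-ones input whose result is easy to verify by hand:

# Shape check: N=2, C=3, H=W=8, F=4, HH=WW=3, stride=2, pad=1
# gives H' = W' = 1 + (8 + 2 - 3) // 2 = 4.
np.random.seed(0)
x = np.random.randn(2, 3, 8, 8)
w = np.random.randn(4, 3, 3, 3)
b = np.random.randn(4)
out, _ = conv_forward(x, w, b, {'stride': 2, 'pad': 1})
assert out.shape == (2, 4, 4, 4)

# Hand-verifiable case: with an all-ones input, every output element
# equals the sum of the filter weights plus the bias (6 + 1 = 7).
x1 = np.ones((1, 1, 4, 4))
w1 = np.arange(4.0).reshape(1, 1, 2, 2)  # weights 0, 1, 2, 3
out1, _ = conv_forward(x1, w1, np.array([1.0]), {'stride': 1, 'pad': 0})
assert np.allclose(out1, 7.0)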
Backward Pass
Implementation of the backward pass for a convolutional layer. Note that the gradient with respect to the input is itself a convolution, but with the 180-degree-rotated kernel. The notation follows that of the forward pass.
Input:
- dout: Upstream derivatives.
- cache: A tuple of (x, w, b, conv_param) as in conv_forward
Output:
- dx: Gradient with respect to x
- dw: Gradient with respect to w
- db: Gradient with respect to b
Python Code
def conv_backward(dout, cache):
    x, w, b, conv_param = cache
    N, C, H, W = x.shape
    F, C, HH, WW = w.shape
    stride, pad = conv_param['stride'], conv_param['pad']
    H_ = 1 + math.floor((H + 2 * pad - HH) / stride)
    W_ = 1 + math.floor((W + 2 * pad - WW) / stride)
    x_padded = np.pad(x, [(0,), (0,), (pad,), (pad,)])
    # db: each filter's bias accumulates the upstream gradient over all
    # data points and output positions.
    db = np.sum(dout, axis=(0, 2, 3))
    # dw: correlate the padded input with the upstream gradient.
    dw = np.zeros_like(w)
    for f in range(F):
        for c in range(C):
            for i in range(HH):
                for j in range(WW):
                    dw[f, c, i, j] = np.sum(
                        x_padded[:, c, i:i+H_*stride:stride, j:j+W_*stride:stride]
                        * dout[:, f, :, :])
    # dx: convolve the dilated, padded upstream gradient with the
    # 180-degree-rotated filters, with the filter and channel axes swapped.
    w_rot = np.rot90(w, 2, axes=(2, 3))
    w_rot = np.moveaxis(w_rot, 0, 1)  # shape (C, F, HH, WW)
    _, _, h_out, w_out = dout.shape
    # Dilate dout: insert (stride - 1) zeros between adjacent elements.
    H_out = h_out + (stride - 1) * (h_out - 1)
    W_out = w_out + (stride - 1) * (w_out - 1)
    new_out = np.zeros((N, F, H_out, W_out))
    new_out[:, :, ::stride, ::stride] = dout
    new_out = np.pad(new_out, [(0,), (0,), (HH-1,), (WW-1,)])
    dx_full = conv_forward(new_out, w_rot, b=np.zeros(C),
                           conv_param={'stride': 1, 'pad': 0})[0]
    # Strip the rows and columns that correspond to the forward-pass
    # padding (a no-op when pad == 0). Assumes (H + 2 * pad - HH) and
    # (W + 2 * pad - WW) are divisible by stride.
    dx = dx_full[:, :, pad:pad + H, pad:pad + W]
    return dx, dw, db
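To gain confidence in the analytic gradients, one can compare them against centered finite differences of the scalar loss L = sum(out * dout). The helper numerical_grad below is our own illustrative utility, not part of the original post:

def numerical_grad(f, arr, dout, h=1e-6):
    # Centered finite differences of L = sum(f() * dout) w.r.t. arr,
    # perturbing one element of arr at a time.
    grad = np.zeros_like(arr)
    it = np.nditer(arr, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        old = arr[idx]
        arr[idx] = old + h
        pos = np.sum(f() * dout)
        arr[idx] = old - h
        neg = np.sum(f() * dout)
        arr[idx] = old
        grad[idx] = (pos - neg) / (2 * h)
        it.iternext()
    return grad

np.random.seed(1)
x = np.random.randn(2, 3, 5, 5)
w = np.random.randn(2, 3, 3, 3)
b = np.random.randn(2)
conv_param = {'stride': 2, 'pad': 1}
out, cache = conv_forward(x, w, b, conv_param)
dout = np.random.randn(*out.shape)
dx, dw, db = conv_backward(dout, cache)
f = lambda: conv_forward(x, w, b, conv_param)[0]
assert np.allclose(dx, numerical_grad(f, x, dout), atol=1e-5)
assert np.allclose(dw, numerical_grad(f, w, dout), atol=1e-5)
assert np.allclose(db, numerical_grad(f, b, dout), atol=1e-5)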
Limitations
We have presented simple, concise Python code for implementing a convolutional layer from scratch. However, the code's direct looping over the input data and filters is computationally expensive, especially for larger inputs and filters. Several optimization techniques can address this inefficiency: vectorization (a sketch follows below), FFT-based convolution, parallelization, and Winograd's algorithm. Nevertheless, the primary aim of this work is to unravel the mysteries of convolutional neural networks and to serve as a good starting point for learning about CNNs and their inner workings.
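As an illustration of the first of those techniques, here is a minimal sketch of a vectorized forward pass using the common im2col trick (our own sketch under the same interface as conv_forward, not code from the original post). Every receptive field is flattened into a column so that the four nested loops collapse into a matrix multiplication:

def conv_forward_im2col(x, w, b, conv_param):
    N, C, H, W = x.shape
    F, _, HH, WW = w.shape
    stride, pad = conv_param['stride'], conv_param['pad']
    H_ = 1 + (H + 2 * pad - HH) // stride
    W_ = 1 + (W + 2 * pad - WW) // stride
    x_padded = np.pad(x, [(0,), (0,), (pad,), (pad,)])
    # cols[n, :, k] is the flattened receptive field of output pixel k.
    cols = np.empty((N, C * HH * WW, H_ * W_))
    for i in range(H_):
        for j in range(W_):
            patch = x_padded[:, :, i*stride:i*stride+HH, j*stride:j*stride+WW]
            cols[:, :, i * W_ + j] = patch.reshape(N, -1)
    # One batched matrix multiplication replaces the per-pixel dot products.
    out = w.reshape(F, -1) @ cols + b.reshape(1, F, 1)
    return out.reshape(N, F, H_, W_)

The remaining loop only gathers the column matrix; the arithmetic-heavy part is a single matmul, which NumPy dispatches to an optimized BLAS routine.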
portfolio
Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2
publications
How to Implement a Convolutional Layer from Scratch?
A tutorial on implementing a convolutional layer from scratch with NumPy, covering both forward and backward propagation.
talks
Talk 1 on Relevant Topic in Your Field
Published:
This is a description of your talk, which is a markdown file that can be all markdown-ified like any other post. Yay markdown!
Conference Proceeding talk 3 on Relevant Topic in Your Field
Published:
This is a description of your conference proceedings talk; note the different value in the type field. You can put anything in this field.
teaching
Teaching experience 1
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Teaching experience 2
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.