I am trying to implement Epshtein's paper (Detecting Text in Natural Scenes with Stroke Width Transform, 2010) for text detection in natural images. The first step is edge detection.
I am getting some extra edges inside my text. How should I remove those?
Original image:
My edge detection: in the example, you can see extra edges inside the text 'WHY HURRY'.
I have tried these steps in Matlab:
% contrast enhancement
I_adjust = imadjust(I);
% dilation & erosion
se = strel(ones(3,3));
I_dilate = imdilate(I_adjust, se);
I_final = imerode(I_dilate, se);
% gaussian smoothing
h_mask = fspecial('gaussian');
I_final = imfilter(I_final,h_mask);
figure; imshow(I_final);
BW_canny = edge(I_final,'canny');
figure; imshow(BW_canny);
Problem #2:
As per belisarius's suggestion, I found that a mean-shift filter works quite well for text region segmentation. Now I am facing another problem in implementing the Stroke Width Transform (see Epshtein's paper).
The stroke width works well for characters like 'H' and 'Y', and even for 'S', because the corresponding edges are usually at a constant distance if we proceed in the direction of the gradient.
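For reference, the ray step described above can be sketched on a toy example. This is only a sketch under assumptions: the edge map and gradient field are built by hand instead of coming from a real image, the variable names are made up, and the pi/6 tolerance for "roughly opposite" gradients is my reading of the paper and should be checked against it.

```matlab
% Hand-built toy inputs (assumptions, not real data): a vertical stroke with
% its left edge at column 20 and its right edge at column 27. Gradients are
% unit vectors pointing into the stroke from both sides.
H = 40;  W = 50;
BW = false(H, W);
BW(:, 20) = true;
BW(:, 27) = true;
gx = zeros(H, W);
gy = zeros(H, W);
gx(:, 20) =  1;     % left edge: gradient points right
gx(:, 27) = -1;     % right edge: gradient points left

% Start at one pixel of the left edge and walk along its gradient direction.
r0 = 15;  c0 = 20;
d = [gx(r0, c0), gy(r0, c0)];
d = d / norm(d);

strokeWidth = NaN;                  % stays NaN if no valid partner is found
for n = 1:max(H, W)
    c = round(c0 + n * d(1));
    r = round(r0 + n * d(2));
    if c < 1 || c > W || r < 1 || r > H
        break;                      % ray left the image
    end
    if BW(r, c)
        dq = [gx(r, c), gy(r, c)];
        dq = dq / norm(dq);
        % Accept the partner only if its gradient is roughly opposite to d.
        if acos(max(-1, min(1, dot(-d, dq)))) < pi/6
            strokeWidth = hypot(c - c0, r - r0);
        end
        break;                      % stop at the first edge pixel either way
    end
end
disp(strokeWidth);
```

On a straight stroke like this, every ray started from the left edge meets the same right edge, so all rays report the same width.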
The problem comes with characters like 'W'. For one portion of the left edge of the first upstroke, we get the right edge of the second upstroke as its corresponding edge, whereas for another portion we get the right edge of the first upstroke. This introduces significant variance in the stroke width over the region of the 'W', which leads to it being classified as a non-text region according to the paper.
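To make the variance issue concrete, here is a small illustration with made-up ray lengths. The numbers are purely hypothetical and were not measured on the image above; they only mimic the situation where some rays overshoot to the neighbouring upstroke.

```matlab
% Made-up ray lengths in pixels (illustrative only, not measured data).
w_H = [5 5 6 5 5 5];          % every ray crosses the same stroke
w_W = [5 5 6 14 15 16 5 5];   % some rays overshoot to the next upstroke

% Compare the spread of stroke widths in the two cases.
fprintf('H: mean %.2f, variance %.2f\n', mean(w_H), var(w_H));
fprintf('W: mean %.2f, variance %.2f\n', mean(w_W), var(w_W));
```

The second set has a much larger variance relative to its mean, which is the kind of component a variance-based filter would reject.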
Can anyone suggest any solution?
Use a Mean Shift Filter before the edge detection. Example in Mathematica:
i = Import["http://img839.imageshack.us/img839/28/whyhurry.jpg"];
iM = MeanShiftFilter[i, 2, .15, MaxIterations -> 10]
EdgeDetect[iM]
Outputs:
Take a look at the Matlab documentation for edge and the Wikipedia article on the Canny algorithm.
You can call edge(I, 'canny', thresh, sigma) for more control. Play around with the low and high edge thresholds. I'd try raising the high threshold first: since the interior edges are not connected to the letter edges, the gradient magnitude inside the letters must exceed the high threshold somewhere, so only a higher threshold will suppress them.
You can also increase sigma to blur the image more before edge detection. (Your Gaussian blurring is redundant, because edge blurs the image for you.)
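A minimal sketch of that call, assuming the Image Processing Toolbox is available. The synthetic image, the threshold values and the sigma are illustrative assumptions, not values tuned for the photo in the question; the idea is to read back the thresholds edge picks by default and then adjust from there.

```matlab
% Synthetic stand-in for the photo (an assumption, not the original image):
% a bright bar on a dark background with faint noise everywhere.
rng(0);                                    % reproducible noise
I = zeros(100, 100);
I(30:70, 20:80) = 0.8;
I = I + 0.03 * randn(size(I));
I = min(max(I, 0), 1);                     % clamp to [0, 1]

% Let edge() choose the thresholds and return what it picked.
[BW_auto, thresh_auto] = edge(I, 'canny');
fprintf('automatic thresholds: low %.3f, high %.3f\n', thresh_auto);

% Retry with hand-picked thresholds and more smoothing.
thresh = [0.1 0.3];    % [low high], both in [0, 1]; illustrative values
sigma  = 2;            % larger than the default, i.e. more blur
BW_tuned = edge(I, 'canny', thresh, sigma);

figure; imshow(BW_auto);  title('default thresholds');
figure; imshow(BW_tuned); title('hand-picked thresholds and sigma');
```

Starting from the automatic thresholds and changing one value at a time makes it easier to see which parameter removes the interior edges.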