You take an image and mark specific points on it (for example, the regions around a person's eyes, nose, and mouth), then warp the image so that those points land on the corresponding points marked in another image. Something like:
transform(original_image, marked_points_in_the_original, marked_points_in_the_reference)
I can't seem to find an algorithm describing it, nor can I find any libraries that implement it. I'm willing to write it myself too, as long as I can find good, easy-to-follow material on it. I know it's possible, though, since I've seen some incomplete PDFs on Google about it (they don't really explain how to do it).
Here's an example of the marked points and the transformation, since you asked for clarification. Though this one isn't using 2 people as I said earlier.
Edit: I managed to get the im.transform method working, but the argument is a list of ((box_x, box_y, box_width, box_height), (x0, y0, x1, y1, x2, y2, x3, y3)), with the first point being NW, the second SW, the third NE and the fourth SE. (0, 0) is the leftmost upper part of the screen as far as I could tell. If I did everything right, then this method doesn't really do what I need.
The sample code given by Blender doesn't work for me. Also, the PIL documentation for im.transform is ambiguous, so I dug into the PIL source code and finally figured out how to use the interface. Here's my complete usage:
import numpy as np
from PIL import Image

def quad_as_rect(quad):
    # True if the quad (corners in NW, SW, SE, NE order) is an axis-aligned rectangle.
    if quad[0] != quad[2]: return False
    if quad[1] != quad[7]: return False
    if quad[4] != quad[6]: return False
    if quad[3] != quad[5]: return False
    return True

def quad_to_rect(quad):
    # Convert a rectangular quad to (x0, y0, x1, y1).
    assert(len(quad) == 8)
    assert(quad_as_rect(quad))
    return (quad[0], quad[1], quad[4], quad[3])

def rect_to_quad(rect):
    # Convert (x0, y0, x1, y1) to a quad in NW, SW, SE, NE order.
    assert(len(rect) == 4)
    return (rect[0], rect[1], rect[0], rect[3], rect[2], rect[3], rect[2], rect[1])

def shape_to_rect(shape):
    # shape is (width, height), e.g. im.size.
    assert(len(shape) == 2)
    return (0, 0, shape[0], shape[1])

def griddify(rect, w_div, h_div):
    # Build an (h_div + 1) x (w_div + 1) grid of integer [x, y] vertices covering rect.
    w = rect[2] - rect[0]
    h = rect[3] - rect[1]
    x_step = w / float(w_div)
    y_step = h / float(h_div)
    y = rect[1]
    grid_vertex_matrix = []
    for _ in range(h_div + 1):
        grid_vertex_matrix.append([])
        x = rect[0]
        for _ in range(w_div + 1):
            grid_vertex_matrix[-1].append([int(x), int(y)])
            x += x_step
        y += y_step
    grid = np.array(grid_vertex_matrix)
    return grid

def distort_grid(org_grid, max_shift):
    # Shift every vertex randomly by up to max_shift pixels, clamped to the grid bounds.
    new_grid = np.copy(org_grid)
    x_min = np.min(new_grid[:, :, 0])
    y_min = np.min(new_grid[:, :, 1])
    x_max = np.max(new_grid[:, :, 0])
    y_max = np.max(new_grid[:, :, 1])
    new_grid += np.random.randint(- max_shift, max_shift + 1, new_grid.shape)
    new_grid[:, :, 0] = np.maximum(x_min, new_grid[:, :, 0])
    new_grid[:, :, 1] = np.maximum(y_min, new_grid[:, :, 1])
    new_grid[:, :, 0] = np.minimum(x_max, new_grid[:, :, 0])
    new_grid[:, :, 1] = np.minimum(y_max, new_grid[:, :, 1])
    return new_grid

def grid_to_mesh(src_grid, dst_grid):
    # Pair each rectangular cell of dst_grid with the matching quad of src_grid.
    assert(src_grid.shape == dst_grid.shape)
    mesh = []
    for i in range(src_grid.shape[0] - 1):
        for j in range(src_grid.shape[1] - 1):
            # Corner order: NW, SW, SE, NE (i indexes rows, j indexes columns).
            src_quad = [src_grid[i    , j    , 0], src_grid[i    , j    , 1],
                        src_grid[i + 1, j    , 0], src_grid[i + 1, j    , 1],
                        src_grid[i + 1, j + 1, 0], src_grid[i + 1, j + 1, 1],
                        src_grid[i    , j + 1, 0], src_grid[i    , j + 1, 1]]
            dst_quad = [dst_grid[i    , j    , 0], dst_grid[i    , j    , 1],
                        dst_grid[i + 1, j    , 0], dst_grid[i + 1, j    , 1],
                        dst_grid[i + 1, j + 1, 0], dst_grid[i + 1, j + 1, 1],
                        dst_grid[i    , j + 1, 0], dst_grid[i    , j + 1, 1]]
            dst_rect = quad_to_rect(dst_quad)
            mesh.append([dst_rect, src_quad])
    return mesh

im = Image.open('./old_driver/data/train/c0/img_292.jpg')
dst_grid = griddify(shape_to_rect(im.size), 4, 4)
src_grid = distort_grid(dst_grid, 50)
mesh = grid_to_mesh(src_grid, dst_grid)
im = im.transform(im.size, Image.MESH, mesh)
im.show()
Before:
After:
I suggest running the code above in IPython and then printing mesh to see what kind of input im.transform needs. For me the output is:
In [1]: mesh
Out[1]:
[[(0, 0, 160, 120), [0, 29, 29, 102, 186, 120, 146, 0]],
[(160, 0, 320, 120), [146, 0, 186, 120, 327, 127, 298, 48]],
[(320, 0, 480, 120), [298, 48, 327, 127, 463, 77, 492, 26]],
[(480, 0, 640, 120), [492, 26, 463, 77, 640, 80, 605, 0]],
[(0, 120, 160, 240), [29, 102, 9, 241, 162, 245, 186, 120]],
[(160, 120, 320, 240), [186, 120, 162, 245, 339, 214, 327, 127]],
[(320, 120, 480, 240), [327, 127, 339, 214, 513, 284, 463, 77]],
[(480, 120, 640, 240), [463, 77, 513, 284, 607, 194, 640, 80]],
[(0, 240, 160, 360), [9, 241, 27, 364, 202, 365, 162, 245]],
[(160, 240, 320, 360), [162, 245, 202, 365, 363, 315, 339, 214]],
[(320, 240, 480, 360), [339, 214, 363, 315, 453, 373, 513, 284]],
[(480, 240, 640, 360), [513, 284, 453, 373, 640, 319, 607, 194]],
[(0, 360, 160, 480), [27, 364, 33, 478, 133, 480, 202, 365]],
[(160, 360, 320, 480), [202, 365, 133, 480, 275, 480, 363, 315]],
[(320, 360, 480, 480), [363, 315, 275, 480, 434, 469, 453, 373]],
[(480, 360, 640, 480), [453, 373, 434, 469, 640, 462, 640, 319]]]
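To read one entry of that output, it can help to give the corners names. The sketch below assumes the corner order that grid_to_mesh above produces (NW, SW, SE, NE); describe_cell is a hypothetical helper written for illustration, not part of PIL.

```python
def describe_cell(cell):
    """Split one MESH entry into its target rectangle and named source corners.

    Assumes the corner order produced by grid_to_mesh: NW, SW, SE, NE.
    """
    dst_rect, src_quad = cell
    names = ("NW", "SW", "SE", "NE")
    # Pair up the flat (x0, y0, x1, y1, ...) list into (x, y) points.
    corners = list(zip(src_quad[0::2], src_quad[1::2]))
    described = {"target": tuple(dst_rect)}
    described.update(zip(names, corners))
    return described


# First entry of the mesh printed above.
first = describe_cell([(0, 0, 160, 120), [0, 29, 29, 102, 186, 120, 146, 0]])
print(first)
```

Read this way, the first cell says: fill the output rectangle (0, 0)-(160, 120) with whatever lies inside the source quadrilateral whose corners are the four named points.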
On a similar note, you could use ImageMagick's Python API to do Shepard's distortion.
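The idea behind Shepard's method is inverse-distance weighting: every control point pulls nearby pixels along with it, and the pull fades with distance. The following is only a pure-Python sketch of that idea for a single point, not ImageMagick's implementation; shepard_displace and its power parameter are names made up for this example.

```python
def shepard_displace(point, src_points, dst_points, power=2.0):
    """Move `point` by the inverse-distance-weighted average of the
    control-point displacements (dst - src)."""
    px, py = point
    total_w = 0.0
    dx = dy = 0.0
    for (sx, sy), (tx, ty) in zip(src_points, dst_points):
        dist_sq = (px - sx) ** 2 + (py - sy) ** 2
        if dist_sq == 0:
            # Exactly on a control point: it moves straight to its target.
            return (float(tx), float(ty))
        w = 1.0 / dist_sq ** (power / 2.0)
        total_w += w
        dx += w * (tx - sx)
        dy += w * (ty - sy)
    return (px + dx / total_w, py + dy / total_w)


src = [(0, 0), (10, 0)]
dst = [(0, 0), (14, 0)]
print(shepard_displace((10, 0), src, dst))  # a control point lands on its target
print(shepard_displace((5, 0), src, dst))   # halfway between the two control points
```

A real image warp would typically evaluate this for every output pixel (using the inverse mapping) and then resample, which is the part a library does for you.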
Yep, there is. It's a bit low-level, but PIL (the Python Imaging Library) has a function to do this sort of transformation. I've never really had it work for me (as my problem was a bit simpler), but you can play with it.
Here's a good resource for the PIL transformations (you'd want to look at MESH): http://effbot.org/tag/PIL.Image.Image.transform.
From the documentation:
Similar to QUAD, but data is a list of target rectangles and corresponding source quadrilaterals.
im.transform(size, MESH, data)
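Data is a list of (target rectangle, source quadrilateral) pairs. Each rectangle is (x0, y0, x1, y1) in the output image, and each quadrilateral is eight numbers giving its NW, SW, SE and NE corners in the source image:
data = [((x0, y0, x1, y1), (nw_x, nw_y, sw_x, sw_y, se_x, se_y, ne_x, ne_y)),
        ...]
For each pair, the contents of the source quadrilateral are mapped into the target rectangle.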
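As a starting point for experimenting, here is a small sketch that builds MESH data in which every source quadrilateral coincides with its target rectangle, using the NW, SW, SE, NE corner order. identity_mesh is a hypothetical helper, not a PIL function; it only builds the list and does not touch any image.

```python
def identity_mesh(width, height, cols, rows):
    """Build MESH data whose source quads coincide with their target rectangles."""
    data = []
    for r in range(rows):
        for c in range(cols):
            x0, x1 = width * c // cols, width * (c + 1) // cols
            y0, y1 = height * r // rows, height * (r + 1) // rows
            bbox = (x0, y0, x1, y1)                  # target rectangle
            quad = (x0, y0, x0, y1, x1, y1, x1, y0)  # NW, SW, SE, NE
            data.append((bbox, quad))
    return data


mesh = identity_mesh(640, 480, 4, 4)
print(len(mesh), mesh[0])
```

Passing such a list to im.transform(im.size, MESH, mesh) should leave the picture essentially unchanged; moving individual quad corners away from the rectangle corners is what produces the warp.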
I've got a solution using OpenCV by triangulating the transformation points:
It does not look perfect, but with more points on the source/target image the results get better.
Code
Here is the code I used for the transformation. At the bottom you can see how to call your transform function.
#!/usr/bin/env python3
import cv2
import numpy as np

def get_triangulation_indices(points):
    """Get index triples for every triangle
    """
    # Bounding rectangle
    bounding_rect = (*points.min(axis=0), *points.max(axis=0))
    # Triangulate all points
    subdiv = cv2.Subdiv2D(bounding_rect)
    subdiv.insert(list(points))
    # Iterate over all triangles
    for x1, y1, x2, y2, x3, y3 in subdiv.getTriangleList():
        # Get index of all points
        yield [(points==point).all(axis=1).nonzero()[0][0] for point in [(x1,y1), (x2,y2), (x3,y3)]]

def crop_to_triangle(img, triangle):
    """Crop image to triangle
    """
    # Get bounding rectangle as (x, y, width, height)
    bounding_rect = cv2.boundingRect(triangle)
    # Crop image to bounding box (a view into img, not a copy)
    img_cropped = img[bounding_rect[1]:bounding_rect[1] + bounding_rect[3],
                      bounding_rect[0]:bounding_rect[0] + bounding_rect[2]]
    # Move triangle to coordinates in cropped image
    triangle_cropped = [(point[0]-bounding_rect[0], point[1]-bounding_rect[1]) for point in triangle]
    return triangle_cropped, img_cropped

def transform(src_img, src_points, dst_img, dst_points):
    """Transforms source image to target image, overwriting the target image.
    """
    for indices in get_triangulation_indices(src_points):
        # Get triangles from indices
        src_triangle = src_points[indices]
        dst_triangle = dst_points[indices]
        # Crop to triangle, to make calculations more efficient
        src_triangle_cropped, src_img_cropped = crop_to_triangle(src_img, src_triangle)
        dst_triangle_cropped, dst_img_cropped = crop_to_triangle(dst_img, dst_triangle)
        # Calculate transform to warp from old image to new
        warp_matrix = cv2.getAffineTransform(np.float32(src_triangle_cropped), np.float32(dst_triangle_cropped))
        # Warp image
        dst_img_warped = cv2.warpAffine(src_img_cropped, warp_matrix, (dst_img_cropped.shape[1], dst_img_cropped.shape[0]), None, flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REFLECT_101)
        # Create mask for the triangle we want to transform
        mask = np.zeros(dst_img_cropped.shape, dtype=np.uint8)
        cv2.fillConvexPoly(mask, np.int32(dst_triangle_cropped), (1.0, 1.0, 1.0), 16, 0)
        # Delete all existing pixels at given mask (in place, so dst_img changes too)
        dst_img_cropped *= 1 - mask
        # Add new pixels to masked area
        dst_img_cropped += dst_img_warped * mask

if __name__ == "__main__":
    # Inputs
    src_img = cv2.imread("woman.jpg")
    dst_img = cv2.imread("cheetah.jpg")
    # dtype=np.int32 is an addition: OpenCV point functions such as cv2.boundingRect
    # expect 32-bit values, while NumPy defaults to int64 on many platforms.
    src_points = np.array([(40, 27), (38, 65), (47, 115), (66, 147), (107, 166), (147, 150), (172, 118), (177, 75), (173, 26), (63, 19), (89, 30), (128, 34), (152, 27), (75, 46), (142, 46), (109, 48), (95, 96), (107, 91), (120, 97), (84, 123), (106, 117), (132, 121), (97, 137), (107, 139), (120, 135)], dtype=np.int32)
    dst_points = np.array([(2, 16), (0, 60), (2, 143), (47, 181), (121, 178), (208, 181), (244, 133), (241, 87), (241, 18), (41, 15), (73, 20), (174, 16), (218, 16), (56, 23), (191, 23), (120, 48), (94, 128), (120, 122), (150, 124), (83, 174), (122, 164), (159, 173), (110, 174), (121, 174), (137, 175)], dtype=np.int32)
    # Apply transformation
    transform(src_img, src_points, dst_img, dst_points)
    # Show result
    cv2.imshow("Transformed", dst_img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

The src_points and dst_points in the main function are hardcoded and correspond to the landmarks that are marked green in the images above. The code was partly inspired by this online article, but it has been cleaned up a bit. After answering this question, I also created my own FaceChanger GitHub repo, with an interactive Python app that uses the same functionality as described in this answer.
Requirements
- Numpy:
pip3 install numpy
- OpenCV:
pip3 install opencv-python
How it works
Triangulation
First we need to triangulate the image, which turns the points from the two top images into the triangles at the bottom. We need triangles instead of points because this lets us transform the individual triangles separately, which will make our lives easier down the road. The triangulation is done using Delaunay triangulation with OpenCV. The points of the first and the second image do not necessarily result in the same triangulation, so only the source points are triangulated, and the get_triangulation_indices function returns the indices of all corners for each triangle. Using these indices we can map every source triangle to exactly one destination triangle.
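The index mapping can be sketched without OpenCV. In the example below the triangle is written out by hand in place of a real triangulation result, and triangle_to_indices is a hypothetical helper that mirrors what get_triangulation_indices does for one triangle.

```python
def triangle_to_indices(points, triangle):
    """Return, for each triangle corner, the index of that point in `points`."""
    return [points.index(corner) for corner in triangle]


src_points = [(0, 0), (10, 0), (0, 10), (10, 10)]
dst_points = [(1, 1), (12, 0), (0, 9), (11, 12)]

# One triangle as a triangulation of src_points might report it: coordinates only.
src_triangle = [(10, 0), (0, 10), (10, 10)]

# Look the corners up in the source list, then reuse the indices on the target list.
indices = triangle_to_indices(src_points, src_triangle)
dst_triangle = [dst_points[i] for i in indices]
print(indices, dst_triangle)
```

Because both point lists are indexed the same way, the triangulation only has to be computed once, on the source points.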
Warp Triangles
The triangles are warped using OpenCV's warpAffine function. The issue with this function is that it warps an entire image and not just one triangle, so we have to do some more work to warp only the triangles.
Cut out triangle
First we cut out only the part of the source and target image that contains the source or target triangle. Strictly speaking this is not necessary, but it is much faster because we don't have to warp the entire image every time. This is done by the crop_to_triangle function.
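What the crop step computes can be sketched in plain Python. The helper below is hypothetical; the inclusive box size (maximum minus minimum plus one) is meant to mirror what cv2.boundingRect returns for integer points, which is an assumption worth checking against your OpenCV version.

```python
def crop_box_and_local_triangle(triangle):
    """Return the bounding box (x, y, w, h) and the triangle shifted into it."""
    xs = [x for x, _ in triangle]
    ys = [y for _, y in triangle]
    x, y = min(xs), min(ys)
    # +1 so the box includes the pixels at the maximum coordinates.
    w, h = max(xs) - x + 1, max(ys) - y + 1
    # Express the corners relative to the top-left corner of the box.
    local = [(px - x, py - y) for px, py in triangle]
    return (x, y, w, h), local


box, local = crop_box_and_local_triangle([(40, 27), (38, 65), (63, 19)])
print(box, local)
```

The shifted coordinates are what the affine transform is later computed from, so the warp only ever works on the small cropped patch.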
Transform image
Then we work out how the image has to be distorted to get from the source triangle to the target triangle, using cv2.getAffineTransform. This gives us a 2x3 transformation matrix that we can pass to cv2.warpAffine to warp our image to the destination proportions.
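Three point pairs determine an affine transform exactly, which is why triangles are convenient here. As an illustration of the underlying math only (not how OpenCV implements it), the hypothetical helpers below solve for the 2x3 matrix with Cramer's rule and apply it to a point.

```python
def affine_from_triangles(src, dst):
    """Solve for [[a, b, c], [d, e, f]] with x' = a*x + b*y + c and
    y' = d*x + e*y + f, given three source/target point pairs."""
    (x0, y0), (x1, y1), (x2, y2) = src
    det = x0 * (y1 - y2) - y0 * (x1 - x2) + (x1 * y2 - x2 * y1)
    if det == 0:
        raise ValueError("source triangle is degenerate (collinear points)")

    def solve(v0, v1, v2):
        # Cramer's rule for the system [x_i, y_i, 1] . [p, q, r] = v_i
        p = (v0 * (y1 - y2) - y0 * (v1 - v2) + (v1 * y2 - v2 * y1)) / det
        q = (x0 * (v1 - v2) - v0 * (x1 - x2) + (x1 * v2 - x2 * v1)) / det
        r = (x0 * (y1 * v2 - y2 * v1) - y0 * (x1 * v2 - x2 * v1)
             + v0 * (x1 * y2 - x2 * y1)) / det
        return [p, q, r]

    return [solve(*[pt[0] for pt in dst]), solve(*[pt[1] for pt in dst])]


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    (a, b, c), (d, e, f) = matrix
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


src = [(0, 0), (1, 0), (0, 1)]
dst = [(2, 3), (4, 3), (2, 6)]
matrix = affine_from_triangles(src, dst)
print(matrix)
print([apply_affine(matrix, p) for p in src])
```

Applying the matrix to the three source corners reproduces the three target corners; every point inside the triangle is carried along by the same linear map.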
Mask to triangle
Now we have the issue that the warp did not just transform our triangle, but our entire src_img_cropped. So we have to paste only the pixels belonging to our triangle into the target image. We can use cv2.fillConvexPoly to create a mask of the target triangle, use it to delete all pixels of the destination image that lie within that triangle, and then add the warped triangle to the spot we just emptied. This is done with NumPy array operations.
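The clear-then-add arithmetic can be tried on a tiny grid without OpenCV or NumPy. In this sketch a simple sign test stands in for cv2.fillConvexPoly, so pixels on the edges may be classified differently than OpenCV would; both function names are made up for the example.

```python
def inside_triangle(p, a, b, c):
    """True if p lies inside or on the edge of triangle abc (sign test)."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def paste_triangle(dst, warped, triangle):
    """Overwrite dst with warped wherever the pixel falls in the triangle."""
    a, b, c = triangle
    for y, row in enumerate(dst):
        for x in range(len(row)):
            mask = 1 if inside_triangle((x, y), a, b, c) else 0
            # Same arithmetic as the array version: clear the pixel, then add.
            row[x] = row[x] * (1 - mask) + warped[y][x] * mask
    return dst


dst = [[0] * 4 for _ in range(4)]
warped = [[9] * 4 for _ in range(4)]
paste_triangle(dst, warped, [(0, 0), (3, 0), (0, 3)])
for row in dst:
    print(row)
```

Only the pixels under the triangle take the value from the warped patch; everything else in the destination is left alone.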
Conclusion
This is a fairly simple method to achieve the task. It does, however, sometimes produce unnatural-looking straight edges, so it might not be perfect for all uses. The quality of the result improves if you add more points to your source and target images. You also need to add the corners of the source and target images to your points if you want the entire image to be copied; otherwise the destination image will just be overwritten with parts of the source image, which I see as a feature. This can also be combined with face detection to create a face-swap effect, for which I personally use dlib, which gives great results.
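Adding the image corners is a small step; a possible helper is sketched below. with_image_corners is a hypothetical name, and the image sizes in the usage lines are made-up values. The corners are appended in the same order for both images so that the indices of the two lists still correspond.

```python
def with_image_corners(points, width, height):
    """Return the landmark list followed by the four image corners."""
    corners = [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]
    return list(points) + corners


# Assumed sizes: a 200x180 source image and a 250x200 target image.
src_all = with_image_corners([(40, 27), (38, 65)], 200, 180)
dst_all = with_image_corners([(2, 16), (0, 60)], 250, 200)
print(src_all)
print(dst_all)
```

With the answer's code, the resulting lists would still need to be converted with np.array before being passed to transform.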