Merged
14 changes: 14 additions & 0 deletions examples/Facemesh-keypoints/index.html
@@ -0,0 +1,14 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>ml5.js Facemesh p5 Webcam Example</title>
<script src="https://cdnjs.cloudflare.com/ajax/libs/p5.js/1.6.0/p5.js"></script>
<script src="../../dist/ml5.js"></script>
</head>
<body>
<script src="sketch.js"></script>
</body>
</html>
47 changes: 47 additions & 0 deletions examples/Facemesh-keypoints/sketch.js
@@ -0,0 +1,47 @@
// Copyright (c) 2023 ml5
//
// This software is released under the MIT License.
// https://opensource.org/licenses/MIT

let facemesh;
let video;
let faces = [];
let options = { maxFaces: 1, refineLandmarks: false, flipHorizontal: false };
Member

@B2xx @UpKindLikeWater I am wondering whether an options object in the examples is worth having. flipHorizontal seems useful, but it is perhaps something that ml5 should handle uniformly across all models (e.g. with the existing utils class)? @shiffman

Is there a drawback to having the default for maxFaces at a higher value?
When would one want to enable refineLandmarks? (can we do it always, or never?)

Member

"facemesh" might be an exception b/c it's a bit more complex of a model than bodypose or handpose, but I would lean towards not including the options in the beginner, first example and relying on defaults instead. I would also guess that most starter examples and demos would be for one face only, but maybe allowing more than one is best by default. Does it affect performance?

I was curious about flipHorizontal. I realize it doesn't actually change the video itself; it probably just mirrors all the points? This goes back to questions around the flipVideo() utility function we have in the current ml5.js. Do we want to keep it? Some options are:

  • Build a utility in ml5.js that mirrors video
  • Clear examples that show how to mirror the video when drawing.... and maybe all the models would include an option that mirrors the landmarks like this one does?

@ziyuan-linn what do you think in terms of bodypose and handpose?

Member

I used to take the approach of mirroring the video while drawing - but in models that don't have a flipHorizontal property, that leaves one with either some more convoluted code that does the arithmetic on the result - or running the whole draw function mirrored, which then quickly runs into issues when displaying text 😅.
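
For models without a flipHorizontal option, the "arithmetic on the result" mentioned above could look roughly like the sketch below. `mirrorKeypoints` is a hypothetical helper, not part of ml5.js, and it assumes each keypoint is an object with `x` and `y` in pixel coordinates of a frame that is `frameWidth` pixels wide.

```javascript
// Hypothetical helper (not an ml5.js API): mirror keypoints horizontally
// so they line up with a video that is drawn flipped.
function mirrorKeypoints(keypoints, frameWidth) {
  // Return new objects so the model's original results stay untouched
  return keypoints.map((keypoint) => ({
    ...keypoint,
    x: frameWidth - keypoint.x,
  }));
}

// Example with a 640px-wide frame
const mirrored = mirrorKeypoints(
  [
    { x: 100, y: 50 },
    { x: 640, y: 10 },
  ],
  640
);
console.log(mirrored);
```

This only fixes the coordinates; the video itself would still have to be drawn mirrored, which is the part that gets awkward once text is involved.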

I am not sure how feasible this is to implement, but my preference would be to have a class that transparently flips the image of the underlying video element (so that you can use it even in continuous mode), e.g.:

let video;
let flippedVideo;

function setup() {
  video = createCapture(VIDEO);
  // ..
  flippedVideo = flipVideo(video);
  facemesh.detectStart(flippedVideo, gotFaces);
}

function draw() {
  image(flippedVideo, ...);
}

Or, a flipHorizontal() method for videos in p5.js? 🤔

Contributor Author
@B2xx Sep 6, 2023

Hi @gohai and @shiffman, here's the documentation of the TensorFlow.js face mesh models, which covers all the options that we could use.

I love the idea of a general flip-video option for ml5.js. However, I'm not that sure about its feasibility. Flipping the image of the underlying video element definitely works better for me. Maybe we could use copy() and scale() to achieve this?

According to my tests, there is no drawback to setting maxFaces to a higher value.

As for the refineLandmarks option, the documentation says:

refineLandmarks: Defaults to false. If set to true, refines the landmark coordinates around the eyes and lips, and outputs additional landmarks around the irises.

Users will set refineLandmarks to true only when they need the irises, which I think is not that common. Maybe we could add a comment for this option, or always leave it set to false for the first example / beginners?
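
If the beginner example dropped its options object, the library would need to fill in the defaults itself. A minimal sketch of that idea, assuming the default values are the ones currently written out in these examples; `DEFAULT_OPTIONS` and `resolveOptions` are hypothetical names, not ml5.js internals:

```javascript
// Assumed defaults, copied from the options object used in these examples
const DEFAULT_OPTIONS = {
  maxFaces: 1,
  refineLandmarks: false,
  flipHorizontal: false,
};

// Hypothetical helper: fill in any option the caller left out
function resolveOptions(userOptions = {}) {
  return { ...DEFAULT_OPTIONS, ...userOptions };
}

// A beginner sketch passes nothing and gets the defaults
const beginner = resolveOptions();
// A sketch that needs the irises overrides a single option
const irisTracking = resolveOptions({ refineLandmarks: true });
console.log(beginner, irisTracking);
```

With something like this in place, only the sketches that need iris landmarks or multiple faces would have to mention options at all.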

Member

Hi all! In terms of bodypose and handpose, I agree with @gohai that a function like ml5.flipVideo would work better. Trying to flip the video using the p5 functions has caused a lot of problems for me personally, and it would be nice to have a way to flip the webcam video with one line of code and not interfere with other things.

As far as implementing a function like ml5.flipVideo(video) goes, I am imagining taking these steps:

  1. Get the original video from the p5 video object (or directly if the user inputs an HTML video)
  2. Create an HTML canvas and draw the video on it with the x-axis flipped.
  3. Convert the HTML canvas into a media stream using captureStream()
  4. Return the media stream as a video or p5 video depending on the input type.

I am not 100% sure whether this would work, but I am willing to try to implement this.
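
The steps above could be sketched roughly as follows. This is an untested, browser-only outline rather than a proposed implementation: `drawFlippedFrame` and `createFlippedVideo` are hypothetical names, the input is assumed to be a plain HTML video element whose metadata has already loaded, and wrapping the result back into a p5 video (step 4) is left out.

```javascript
// Step 2: draw one video frame onto a 2D canvas context, mirrored on the x-axis
function drawFlippedFrame(ctx, video, width, height) {
  ctx.save();
  ctx.translate(width, 0); // move the origin to the right edge
  ctx.scale(-1, 1); // flip the x-axis
  ctx.drawImage(video, 0, 0, width, height);
  ctx.restore();
}

// Steps 1, 3 and 4 (browser only): build a mirrored <video> from an existing one.
// Assumes video.videoWidth / videoHeight are already known (metadata loaded).
function createFlippedVideo(video, fps = 30) {
  const canvas = document.createElement("canvas");
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const ctx = canvas.getContext("2d");

  // Keep redrawing so the captured stream follows the webcam
  function tick() {
    drawFlippedFrame(ctx, video, canvas.width, canvas.height);
    requestAnimationFrame(tick);
  }
  tick();

  // Step 3: turn the canvas into a media stream
  const flipped = document.createElement("video");
  flipped.srcObject = canvas.captureStream(fps);
  flipped.muted = true;
  flipped.play();
  return flipped;
}
```

Whether the extra canvas copy per frame is cheap enough for continuous detection is exactly the open question raised here.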

Member

Agreed with all thoughts! @ziyuan-linn I think a good next step would be for you to open a new issue duplicating your comment here and your proposal. We can then also tag in and check with the p5.js team to make sure we aren't doing redundant work (in case they have something like this planned). Or maybe they would prefer we add something to p5 directly! (I would ask here, but I think it may be hard to follow this conversation buried in an inline code comment for a specific pull request.) Also, I don't think we need to resolve the "flipVideo" question as part of facemesh since it applies across the board.

Thanks all!!!


function preload() {
  // Load the facemesh model
  facemesh = ml5.facemesh(options);
}

function setup() {
  createCanvas(640, 480);
  // Create the webcam video and hide it
  video = createCapture(VIDEO);
  video.size(width, height);
  video.hide();
  // Start detecting faces from the webcam video
  facemesh.detectStart(video, gotFaces);
}

function draw() {
  // Draw the webcam video
  image(video, 0, 0, width, height);

  // Draw all the tracked face points
  for (let i = 0; i < faces.length; i++) {
    let face = faces[i];
    for (let j = 0; j < face.keypoints.length; j++) {
      let keypoint = face.keypoints[j];
      fill(0, 255, 0);
      noStroke();
      circle(keypoint.x, keypoint.y, 5);
    }
  }
}

// Callback function for when facemesh outputs data
function gotFaces(results) {
  // Save the output to the faces variable
  faces = results;
  //console.log(faces);
}
14 changes: 14 additions & 0 deletions examples/Facemesh-parts/index.html
@@ -0,0 +1,14 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>ml5.js Facemesh p5 Webcam Example</title>
<script src="https://cdnjs.cloudflare.com/ajax/libs/p5.js/1.6.0/p5.js"></script>
<script src="../../dist/ml5.js"></script>
</head>
<body>
<script src="sketch.js"></script>
</body>
</html>
73 changes: 73 additions & 0 deletions examples/Facemesh-parts/sketch.js
@@ -0,0 +1,73 @@
// Copyright (c) 2023 ml5
//
// This software is released under the MIT License.
// https://opensource.org/licenses/MIT

let facemesh;
let video;
let faces = [];
let options = { maxFaces: 1, refineLandmarks: false, flipHorizontal: false };

function preload() {
  // Load the facemesh model
  facemesh = ml5.facemesh(options);
}

function setup() {
  createCanvas(640, 480);
  // Create the webcam video and hide it
  video = createCapture(VIDEO);
  video.size(width, height);
  video.hide();
  // Start detecting faces from the webcam video
  facemesh.detectStart(video, gotFaces);
}

function draw() {
  // Draw the webcam video
  image(video, 0, 0, width, height);

  drawPartsKeypoints();
  drawPartsBoundingBox();
}

// Draw keypoints for specific face element positions
function drawPartsKeypoints() {
  // If there is at least one face
  if (faces.length > 0) {
    for (let i = 0; i < faces[0].lips.length; i++) {
      let lips = faces[0].lips[i];
      fill(0, 255, 0);
      circle(lips.x, lips.y, 5);
    }
  }
}

// Draw bounding box for specific face element positions
function drawPartsBoundingBox() {
  // If there is at least one face
  if (faces.length > 0) {
    let lipsX = [];
    let lipsY = [];
    for (let i = 0; i < faces[0].lips.length; i++) {
      // Find the lips
      let lips = faces[0].lips[i];
      lipsX.push(lips.x);
      lipsY.push(lips.y);
    }
    noFill();
    rect(
      min(lipsX),
      min(lipsY),
      max(lipsX) - min(lipsX),
      max(lipsY) - min(lipsY)
    );
  }
Member
@gohai Sep 5, 2023

@B2xx @UpKindLikeWater The API design you selected has keys for different landmarks (.lips.) - but inside, you chose to keep the original structure of having arrays of points. This makes it very consistent with the "vanilla" model, but it also makes it a bit harder to use for simple tracking applications - since now it's up to the caller (user) to calculate e.g. the center point, or the extent of this landmark. (That's quite a bit of code above...)

Perhaps we could do something like this instead:

[{
  lips: {
    cx: 300,   // center point calculated from all the points
    cy: 300,
    w: 10,     // width calculated from all the points
    h: 3,
    points: [  // actual points (if you want to draw them)
     { ... },
     { ... },
    ]
  },
  ...
}]

@shiffman Would this make sense to you also?

Member

@gohai yes, I like this idea! I think cx and cy might be a little confusing, if we just use x and y is that clear? Should we set them at the top left or the center? I would also perhaps opt for width and height instead of w and h.

Member

We could also use centerX and centerY, but maybe x, y, width, and height, aligned with the default "rectMode", make the most sense?

Member

I am not very sure about this... having instant access to the center point could make a lot of fun examples even more concise. (e.g. drawing a circle over your nose, drawing a line between the eyes, calculating the distance between mouse and...) But I understand the call not to mix rectModes in APIs.

How about we provide both? 🤔

  lips: {
    x: 250,   // top left
    y: 250,
    width: 100,
    height: 100,
    centerX: 300,
    centerY: 300,
    points: [  // actual points (if you want to draw them)
     { ... },
     { ... },
    ]
  },
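
One way the combined shape above could be computed from a part's raw points is sketched below. `summarizePart` is a hypothetical helper, not an existing ml5.js function; it assumes a non-empty array of `{ x, y }` points and takes "center" to mean the center of the bounding box rather than the average of all points.

```javascript
// Hypothetical helper: summarize one face part (e.g. the lips) from its points.
// Assumes `points` is a non-empty array of objects with numeric x and y.
function summarizePart(points) {
  const xs = points.map((p) => p.x);
  const ys = points.map((p) => p.y);

  // Top-left corner and size, matching p5's default rectMode(CORNER)
  const x = Math.min(...xs);
  const y = Math.min(...ys);
  const width = Math.max(...xs) - x;
  const height = Math.max(...ys) - y;

  return {
    x,
    y,
    width,
    height,
    // Center of the bounding box (an assumption; a centroid is another option)
    centerX: x + width / 2,
    centerY: y + height / 2,
    points, // the original points, in case the caller wants to draw them
  };
}

const lips = summarizePart([
  { x: 250, y: 250 },
  { x: 350, y: 350 },
  { x: 300, y: 260 },
]);
console.log(lips);
```

In a sketch, `rect(lips.x, lips.y, lips.width, lips.height)` would then replace the min/max bookkeeping currently done in drawPartsBoundingBox().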

Contributor Author

I'd like to provide both because I think it is useful for facemesh models that contain so many points in one face element.

I think this structure will also affect the output of bodypose and handpose models, maybe @ziyuan-linn will have some thoughts as well?

Member

Yes, both sounds great!!

Member

I am not too sure if bodypose and handpose have any bounding boxes. There might be one for the entire body and entire hand, which is less useful than the facemesh bounding boxes. I don't really have a strong preference for this. I can see some cases where x, y, width, height is more useful and other cases where centerX, centerY is more useful, so keeping both might be a good idea.

An alternative to keeping both is to keep x and y at the top left corner by default and add a function facemesh.rectMode() which can switch x and y to center mode. However, this might become confusing and may introduce unnecessary complexity.

Member

Yes @ziyuan-linn I agree, I don't think we need to add this for handpose or bodypose, I think it's more specific to the way people may want to use facemesh and the complexity of so many keypoints.

}

// Callback function for when facemesh outputs data
function gotFaces(results) {
  // Save the output to the faces variable
  faces = results;
  //console.log(faces);
}
Binary file added examples/Facemesh-single-image/face.png
14 changes: 14 additions & 0 deletions examples/Facemesh-single-image/index.html
@@ -0,0 +1,14 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>ml5.js Facemesh p5 Image Example</title>
<script src="https://cdnjs.cloudflare.com/ajax/libs/p5.js/1.6.0/p5.js"></script>
<script src="../../dist/ml5.js"></script>
</head>
<body>
<script src="sketch.js"></script>
</body>
</html>
44 changes: 44 additions & 0 deletions examples/Facemesh-single-image/sketch.js
@@ -0,0 +1,44 @@
// Copyright (c) 2023 ml5
//
// This software is released under the MIT License.
// https://opensource.org/licenses/MIT

let facemesh;
let img;
let faces = [];
let options = { maxFaces: 1, refineLandmarks: false, flipHorizontal: false };

function preload() {
  // Load the image to be detected
  img = loadImage("face.png");
  // Load the facemesh model
  facemesh = ml5.facemesh(options);
}

function setup() {
  createCanvas(640, 480);
  // Draw the image
  image(img, 0, 0);
  // Detect faces in an image
  facemesh.detect(img, gotFaces);
}

function draw() {
  // Draw all the face keypoints
  for (let i = 0; i < faces.length; i++) {
    let face = faces[i];
    for (let j = 0; j < face.keypoints.length; j++) {
      let keypoint = face.keypoints[j];
      fill(0, 255, 0);
      noStroke();
      circle(keypoint.x, keypoint.y, 1.5);
    }
  }
}

// Callback function for when facemesh outputs data
function gotFaces(results) {
  // Save the output to the faces variable
  faces = results;
  //console.log(faces);
}
2 changes: 2 additions & 0 deletions package.json
@@ -26,10 +26,12 @@
"webpack-dev-server": "^4.15.1"
},
"dependencies": {
"@mediapipe/face_mesh": "^0.4.1633559619",
"@mediapipe/hands": "^0.4.1675469240",
"@mediapipe/pose": "^0.5.1675469404",
"@mediapipe/selfie_segmentation": "~0.1.0",
"@tensorflow-models/body-segmentation": "^1.0.1",
"@tensorflow-models/face-landmarks-detection": "^1.0.5",
"@tensorflow-models/hand-pose-detection": "^2.0.0",
"@tensorflow-models/pose-detection": "^2.1.0",
"@tensorflow/tfjs": "^4.2.0",