Add object detection implementation by douaaz · Pull Request #195 · ml5js/ml5-next-gen

douaaz · 2024-08-22T12:13:15Z

No description provided.

gohai · 2024-08-25T08:13:40Z

Hi @douaaz & @SherabCodes - I'll be looking through your changes momentarily, and giving feedback. Excited..! 🎉

gohai · 2024-08-25T08:20:58Z

package.json

    "@tensorflow-models/pose-detection": "^2.1.0",
    "@tensorflow-models/speech-commands": "^0.5.4",
-    "@tensorflow/tfjs": "^4.2.0",
+    "@tensorflow/tfjs": "^4.20.0",


Note for @ziyuan-linn: coco-ssd indeed lists tfjs 4.20.0 as its peer dependency. Hope that bumping this won't cause any regressions..

src/ObjectDetector/index.js

examples/objectDetector-single-image copy/sketch.js

examples/soundClassifier-speech-command/handPose-single-image copy/sketch.js

examples/objectDetector-single-image copy/Printable Bill.pdf

examples/objectDetector-single-image copy/sketch.js

examples/objectDetector-webcam/sketch.js

gohai · 2024-08-25T08:49:35Z

examples/objectDetector-webcam/sketch.js

+ * This example demonstrates object detection on an image through ml5.objectDetector.
+ */
+
+let objectDetector;


For structure, I would suggest copying how e.g. BodyPose-keypoints is doing it (using detectStart()). This leads to a much more concise example code, with fewer states that the user has to think about!
I quickly tried this approach with your code, and it appeared to work well!
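
To make the suggestion concrete, here is a rough sketch of how the webcam example might look when it follows the BodyPose-keypoints structure. It assumes that `ml5.objectDetector()` exposes a `detectStart(media, callback)` method the way bodyPose does, and that each result still carries the raw coco-ssd fields (`bbox`, `class`) as in this PR. Neither is confirmed as the final API, so treat the names below as placeholders.

```javascript
// Sketch only: assumes ml5.objectDetector() mirrors bodyPose's detectStart() API.
let video;
let objectDetector;
let objects = []; // latest detections, updated by the callback

function preload() {
  // Assumption: the model loads in preload(), as in the bodyPose examples.
  objectDetector = ml5.objectDetector();
}

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.size(640, 480);
  video.hide();
  // Start continuous detection; gotObjects runs whenever new results arrive.
  objectDetector.detectStart(video, gotObjects);
}

function draw() {
  image(video, 0, 0, width, height);
  for (let object of objects) {
    // Assumption: raw coco-ssd shape, bbox = [x, y, width, height].
    noFill();
    stroke(0, 255, 0);
    rect(object.bbox[0], object.bbox[1], object.bbox[2], object.bbox[3]);
    noStroke();
    fill(0, 255, 0);
    text(object.class, object.bbox[0] + 4, object.bbox[1] + 14);
  }
}

// Callback: store the most recent results for draw() to use.
function gotObjects(results) {
  objects = results;
}
```

The only state the user has to track is the `objects` array; there is no manual "is the model ready" or "is a detection in flight" flag.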

gohai · 2024-08-25T09:00:02Z

src/ObjectDetector/index.js

+    const predictions = await this.model.detect(image);
+    console.log('raw result from cocoSsd', predictions);
+
+    const result = predictions;


Perhaps worth having a look at the previous version of ml5 to see in which format it returned data to the user. Not that you necessarily need to imitate it, but not so sure about bbox..!

From a quick look, previously the result was an array of objects, each with the following properties:

    label: String,
    confidence: Number,
    x: Number (px),
    y: Number (px),
    width: Number (px),
    height: Number (px),
    normalized: {
      x: Number (0-1),
      y: Number (0-1),
      width: Number (0-1),
      height: Number (0-1),
    }

(the normalized part is perhaps less common in ml5, and perhaps we could drop this - but the rest seems to be similar to how e.g. faceMesh returns results presently)
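
As an illustration of the mapping, here is a minimal sketch that converts raw coco-ssd predictions (objects with `bbox: [x, y, width, height]`, `class`, and `score`) into the flat shape listed above, leaving out `normalized`. The function name `formatDetections` is hypothetical and not part of this PR.

```javascript
// Hypothetical helper: map raw coco-ssd predictions to the older ml5 result shape.
// coco-ssd returns bbox as [x, y, width, height] in pixels.
function formatDetections(predictions) {
  return predictions.map((prediction) => {
    const [x, y, width, height] = prediction.bbox;
    return {
      label: prediction.class,
      confidence: prediction.score,
      x,
      y,
      width,
      height,
    };
  });
}

// Example with a hand-written prediction (not real model output).
const formatted = formatDetections([
  { bbox: [10, 20, 30, 40], class: "cat", score: 0.9 },
]);
console.log(formatted[0].label, formatted[0].x, formatted[0].width);
```

In `detect()`, this would stand in for `const result = predictions;`, so the user never sees `bbox`, `class`, or `score` directly.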

src/ObjectDetector/index.js

shiffman · 2024-08-25T19:28:36Z

This is so exciting to see! If it's ok, I will leave a few comments as well.

shiffman · 2024-08-25T19:31:29Z

examples/objectDetector-dimLight/sketch.js

This is a lovely implementation, but I wonder if it's outside the scope of a basic set of Object Detection examples, since it's more about "image processing" than the model itself. Perhaps a separate tutorial could be written for the community page about pre-processing images, since it could be applied across many models, not just object detection.

Regardless, if we were to keep this example I would suggest rewriting it with p5.js functions rather than the native JS code which will be unfamiliar to beginners.

shiffman · 2024-08-25T19:33:15Z

examples/objectDetector-single-image copy/objects.jpg

Generally speaking, I think it's preferable to demonstrate examples without people in them. Who is this person? Do we have permission to use their face? Perhaps there is an image that can demonstrate a wider variety of classes inside the model.

shiffman · 2024-08-25T19:45:41Z

examples/objectDetector-single-image copy/sketch.js

+    let x = object.bbox[0];
+    let y = object.bbox[1];
+    let w = object.bbox[2];
+    let h = object.bbox[3];


Do we want to consider renaming the underlying properties that come out of the model, something like the following might be more intuitive:

    let x = object.bbox.x;
    let y = object.bbox.y;
    let w = object.bbox.w;
    let h = object.bbox.h;

This also offers the opportunity for "destructuring" though this is perhaps a concept less familiar to beginners:

let { x, y, w, h } = object.bbox;
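
For reference, a small sketch of what that renaming could look like inside the library, assuming the model's raw `bbox` is the `[x, y, width, height]` array that coco-ssd returns. The helper name `withNamedBbox` is made up for illustration.

```javascript
// Hypothetical helper: replace the positional bbox array with named properties.
function withNamedBbox(detection) {
  const [x, y, w, h] = detection.bbox;
  // Copy the other fields unchanged and swap in the named bbox object.
  return { ...detection, bbox: { x, y, w, h } };
}

// Example with a hand-written detection (not real model output).
const object = withNamedBbox({ bbox: [10, 20, 30, 40], class: "cat", score: 0.9 });

// Both access styles from the comment above now work.
const { x, y, w, h } = object.bbox;
console.log(object.bbox.x === x, w, h);
```

Beginners can stick with `object.bbox.x`, while the destructured form stays available for those who prefer it.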

gohai · 2024-09-03T12:06:40Z

src/ObjectDetector/index.js

+
+   // await mediaReady(image, false);
+
+   // const predictions = await this.model.detect(image);


The two lines above look like they were commented-out by mistake?

enkatsu · 2025-05-17T12:03:27Z

Hi all, thank you for the great work on this PR!
I just wanted to kindly ask — what’s the current status of this pull request?
I'm really looking forward to seeing this object detection feature merged, and if there's anything I can do to help move it forward (resolving conflicts, testing, etc.), I’d be more than happy to contribute.
Apologies if this comes across as rushing — I truly appreciate the work being done here!
Thanks again 🙏

shiffman · 2025-05-17T21:40:36Z

Hi @enkatsu, thanks for the question and for your interest and enthusiasm! This is on the agenda for summer research and we hope to have something complete by the fall term, stay tuned! The summer work on ml5.js will begin at the start of June, so check back then!

enkatsu · 2025-05-18T03:39:44Z

Hi @shiffman, thanks for the update! That’s great to hear — I’m excited to see how things develop over the summer.
Appreciate all your hard work!

shiffman · 2025-07-30T14:42:18Z

This PR is now superseded by #257 by @yiyujin, which we are merging shortly and which will make it into a new release soon (maybe not until the fall semester starts). Thank you @douaaz and contributors, your research and work on this really helped us get the final version across the finish line!

douaaz and others added 7 commits August 2, 2024 13:14

add cocossd model and object dectection example

762a698

update objectDetector example and index.js files

3123c21

update image

f53a371

update cocossd implementation

00e349d

add box for detected objects

53c6b1f

add webcam example for object detection

79cf58e

.

120b681

douaaz requested review from gohai and ziyuan-linn August 22, 2024 12:13

gohai reviewed Aug 25, 2024

shiffman reviewed Aug 25, 2024

douaaz added 6 commits September 2, 2024 09:09

changes

f1535b0

changes

23dc069

changes

cc08e77

changes after yarn run format

0dbb20e

editing comments

603514e

remove unnecessary code

436e505

gohai reviewed Sep 3, 2024

shiffman closed this Jul 30, 2025



5 participants