TAKEMETOTHEINTERNET

Hand Gesture Recognition with MediaPipe

MediaPipe is a library developed by Google that allows features such as hands, body, and face to be tracked in real time using a simple webcam. The library uses machine learning to analyze the image and generate the specific coordinates of the tracked features.

To start using MediaPipe within Cables, it is necessary to load the “MediaPipe” extension containing the operators needed to interact with the library.

In this article we will cover the steps required to create an interactive application through hand tracking.

First, insert a WebcamTexture node to access the webcam and connect its texture output to a FullScreenRectangle to display it in the canvas. In the settings of the latter operator, set Scale to Stretch. At this point the webcam image is displayed.

Now you need to send the webcam video stream to MediaPipe. Add the MpHandTracking operator and connect it to the CSS Element output of WebcamTexture to receive the tracking coordinates of both hands. Then select one of the two hands with the MpHand operator. Its output is a "points" array containing the coordinates of all the tracking points of the selected hand. To extract the position of a specific point, use the MpHandCoordinate node and choose one of the available points in the "Joint" menu.
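To make the idea of picking one joint out of the "points" array concrete, here is a minimal Python sketch. The landmark numbers follow MediaPipe's hand model (21 landmarks, with the thumb tip at 4 and the index fingertip at 8). The flat [x0, y0, z0, x1, ...] layout and the helper name joint_xy are assumptions made for illustration, not a description of how the Cables operator stores its data.

```python
# Landmark indices from MediaPipe's hand model (21 landmarks in total).
WRIST = 0
THUMB_TIP = 4
INDEX_FINGER_TIP = 8

def joint_xy(points, joint):
    """Return (x, y) for one landmark.

    Assumption for this sketch: points is a flat list laid out as
    [x0, y0, z0, x1, y1, z1, ...].
    """
    base = joint * 3
    return points[base], points[base + 1]

# Hypothetical data: landmark i sits at (i / 100, i / 50, 0).
points = []
for i in range(21):
    points.extend([i / 100, i / 50, 0.0])

index_tip = joint_xy(points, INDEX_FINGER_TIP)
thumb_tip = joint_xy(points, THUMB_TIP)
```

Choosing a joint in the "Joint" menu plays the same role as choosing the index passed to joint_xy here.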

Before you can see anything, you need to add an element that you can move. Immediately after FullScreenRectangle connect a PixelProjection operator. Cables uses a different coordinate system than MediaPipe, so we will adapt the coordinates of our canvas to those received from MediaPipe. In the settings of PixelProjection set Size to Manual and enter a Width and Height of 2. Then center the coordinate origin by setting "Position 0,0" to Center.
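A projection that is 2 units wide and 2 units high with its origin in the center spans -1 to 1 on both axes. As a rough illustration of how such a space relates to image coordinates, the Python sketch below converts a point normalized to the 0..1 range (origin top-left, y pointing down, which is how raw MediaPipe landmarks are expressed) into that centered space. Whether the Cables operators perform this conversion internally, or also mirror the image, is not stated here, so treat the function to_centered as a hypothetical helper.

```python
def to_centered(x_norm, y_norm):
    """Map normalized image coordinates to a centered 2 x 2 space.

    Input:  x and y in [0, 1], origin at the top-left corner, y pointing down.
    Output: x and y in [-1, 1], origin at the center, y pointing up.
    """
    x = x_norm * 2.0 - 1.0
    y = 1.0 - y_norm * 2.0
    return x, y

center = to_centered(0.5, 0.5)    # middle of the image
top_left = to_centered(0.0, 0.0)  # top-left corner of the image
```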

At this point connect a BasicMaterial and a Circle. The canvas now shows the webcam image and a colored circle in the center of the screen. Set the radius to 0.1 to make the circle smaller.

Most likely the circle will look “squashed”. This is because the proportions of the shape are tied to those of the canvas, so resizing the canvas also changes the aspect ratio of the circle. To prevent this, link the canvas's Aspect Ratio, provided by the CanvasInfo operator, to the circle's scale through a Scale operator.
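The arithmetic behind this correction can be sketched in a few lines of Python. The sketch assumes the 2 x 2 projection is stretched over the whole canvas, so one unit covers half the canvas width horizontally and half the canvas height vertically. Compensating on the X axis is just one possible choice, and the function names are hypothetical.

```python
def corrected_scale(canvas_width, canvas_height):
    """Return (scale_x, scale_y) that undoes the horizontal stretch."""
    aspect = canvas_width / canvas_height
    return 1.0 / aspect, 1.0

def on_screen_radii(radius, canvas_width, canvas_height):
    """Pixel radii of the circle after applying the corrected scale."""
    scale_x, scale_y = corrected_scale(canvas_width, canvas_height)
    # One unit of the 2 x 2 space covers half the canvas on each axis.
    radius_x_px = radius * scale_x * canvas_width / 2
    radius_y_px = radius * scale_y * canvas_height / 2
    return radius_x_px, radius_y_px

rx, ry = on_screen_radii(0.1, 1920, 1080)
```

With the correction applied, both pixel radii come out equal, so the shape stays round whatever the canvas size.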

Now we need to make the circle move. To do this, link the coordinates provided by MpHandCoordinate to the circle through a Transform: connect the X and Y outputs of MpHandCoordinate to the respective posX and posY inputs of the Transform node.

Play with MediaPipe

Once you become familiar with how MediaPipe works, the possibilities for artistic development are wide open. The key is to experiment: even small changes to a patch can produce new combinations.

For example, we can further develop the patch just seen to add other types of interaction and elements. In this case, instead of the circle, we will attach a letter of the alphabet to the position of the index finger and thumb. Moving the fingers apart or bringing them closer together will change both the size of the element and the letter, from A when the fingers touch to Z at maximum extension.

Although it may seem complex at first glance, it is only a small step up from the previous patch. First we need to add a second MpHandCoordinate to track the thumb position as well. Replace the Circle with a TextMesh, leaving it on the letter “A” for now.

Our “A”, however, will continue to follow the index finger to which it was previously connected. We need to find the midpoint between the two fingers. To do this we average the respective coordinates, which gives:

"posX = (X(index) + X(thumb))/2 , posY = (Y(index) + Y(thumb))/2"

Now our letter will be perfectly centered between the two fingers. The next step is to scale it relative to the distance between the two fingers. Fortunately Cables provides a Distance2D operator. Connecting the X and Y coordinates of both points to it outputs the distance between the two. We then connect the output of Distance2D to the Scale input of the Transform node.
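For readers who prefer to see the math spelled out, the midpoint and distance steps above correspond to the following Python sketch. The function names and sample coordinates are hypothetical, and the distance is assumed to be the ordinary Euclidean distance between the two points.

```python
import math

def midpoint(index_tip, thumb_tip):
    """Average the two points: where the letter is placed."""
    return (
        (index_tip[0] + thumb_tip[0]) / 2,
        (index_tip[1] + thumb_tip[1]) / 2,
    )

def distance_2d(x1, y1, x2, y2):
    """Euclidean distance between two points: drives the letter's scale."""
    return math.hypot(x2 - x1, y2 - y1)

# Hypothetical fingertip positions in the centered -1..1 space.
index_tip = (0.25, 0.5)
thumb_tip = (0.75, -0.5)

position = midpoint(index_tip, thumb_tip)
scale = distance_2d(*index_tip, *thumb_tip)
```

When the fingers touch, the two points coincide, the distance drops to 0 and the letter shrinks away, which is why the later remapping step starts from a small minimum distance rather than from 0.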

Bringing the index finger and thumb closer together will scale the letter accordingly. Now we need to link the distance between the fingers to the letters of the alphabet. Conceptually, we create an array containing all 26 letters (for the English alphabet, but any alphabet can be used) and map the distance between the two fingers to an index in that array, from 0 to 25.

First add a StringToArray operator and enter all the letters of the alphabet in its input text, one per line. In the operator settings, uncheck Numbers and turn on Split Lines. Next, add an ArrayGetString to select a single letter by index and link it to the input text of the TextMesh operator.

All that remains at this point is to link the Distance2D operator to the index input of ArrayGetString, but to do this we need to remap the output of the first node. The value we receive from Distance2D lies in a range of roughly 0 to 1, while the index into the letter array goes from 0 to 25. We can use the MapRange operator with the following values: oldMin = 0.1, oldMax = 0.7, newMin = 0, newMax = 25. Then add a Round operator to round the decimal value to an integer (the index of an array is always an integer).
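The remapping chain described above (MapRange, then Round, then a lookup in the letter array) can be approximated in Python as follows. This is a sketch under stated assumptions rather than a copy of the Cables operators: the input is clamped to the old range so the index never leaves 0..25, and Python's round() resolves exact .5 ties to the nearest even integer, which may differ from the Round operator.

```python
import string

LETTERS = list(string.ascii_uppercase)  # "A" .. "Z", indices 0 .. 25

def map_range(value, old_min, old_max, new_min, new_max):
    """Linearly remap value from [old_min, old_max] to [new_min, new_max].

    Assumption: out-of-range input is clamped to the old range.
    """
    clamped = min(max(value, old_min), old_max)
    t = (clamped - old_min) / (old_max - old_min)
    return new_min + t * (new_max - new_min)

def letter_for_distance(distance):
    """Pick a letter from the finger distance, using the article's values."""
    index = round(map_range(distance, 0.1, 0.7, 0, 25))
    return LETTERS[index]

touching = letter_for_distance(0.05)   # below oldMin, clamps to the first letter
stretched = letter_for_distance(0.9)   # above oldMax, clamps to the last letter
```

Using 0.1 and 0.7 as the old range instead of 0 and 1 reflects the fact that the two fingertips rarely reach either extreme, so the whole alphabet stays reachable with a comfortable hand movement.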