Interactive input is an essential part of the engine's functional layer: it lets users drive the application with pointing devices, touch, or gestures. In the 0.6 milestone we built the first version of the Galacean Engine interaction system, which currently supports pointer (click/touch) and keyboard input. This article shares the design decisions and remaining shortcomings from that process.
Pointer devices, touch screens, XR devices, and so on are all inputs to the interaction system. All input logic is gathered in the InputManager and subdivided by input type into dedicated managers such as PointerManager and KeyboardManager. The InputManager owns these specific managers; in the interaction phase of each frame, it only needs to drive the per-type logic inside each of them.
The lifecycle of frame processing is as follows:
The internal lifecycle of InputManager is as follows:
Developers can override the hook functions provided by the `Script` component to add the appropriate logic.
| Interface | Trigger Timing and Frequency |
| --- | --- |
| onPointerEnter | Triggered once when the touch point enters the collision volume of the Entity |
| onPointerExit | Triggered once when the touch point leaves the collision volume of the Entity |
| onPointerDown | Triggered once when the touch point is pressed within the collision volume of the Entity |
| onPointerUp | Triggered once when the touch point is released within the collision volume of the Entity |
| onPointerClick | Triggered once when the touch point is pressed and released within the collision volume of the Entity |
| onPointerDrag | Continuously triggered while the touch point is pressed within the collision volume of the Entity, until the touch point is released |

Directly call the methods provided by the InputManager to determine key state.
| Method Name | Method Explanation |
| --- | --- |
| isKeyHeldDown | Returns whether the key is currently being held down |
| isKeyDown | Returns whether the key was pressed during the current frame |
| isKeyUp | Returns whether the key was released during the current frame |
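The semantics of these three queries can be sketched with a minimal key-state store. This is an illustrative mock, not the engine's actual internals; the names `KeyState`, `press`, `release`, and `beginFrame` are assumptions for the sketch.

```typescript
// Minimal sketch of the three query methods over per-frame key state.
class KeyState {
  private held = new Set<string>();
  private downThisFrame = new Set<string>();
  private upThisFrame = new Set<string>();

  // Called at the start of each frame to clear per-frame state.
  beginFrame(): void {
    this.downThisFrame.clear();
    this.upThisFrame.clear();
  }

  // Fed by native keydown events.
  press(code: string): void {
    this.held.add(code);
    this.downThisFrame.add(code);
  }

  // Fed by native keyup events.
  release(code: string): void {
    this.held.delete(code);
    this.upThisFrame.add(code);
  }

  isKeyHeldDown(code: string): boolean { return this.held.has(code); }
  isKeyDown(code: string): boolean { return this.downThisFrame.has(code); }
  isKeyUp(code: string): boolean { return this.upThisFrame.has(code); }
}
```

Note that `isKeyDown`/`isKeyUp` answer "did this happen during the current frame", while `isKeyHeldDown` reflects the continuous state.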
PointerEvent is the direction browsers are taking for mouse and touch interaction. A pointer is a hardware-level abstraction over input devices: developers do not need to care whether the data comes from a mouse, trackpad, or touchscreen. It still has compatibility gaps, however; as caniuse shows, the device coverage of PointerEvent is 92.82%, which we address by importing a polyfill.
Hook functions that respond to the pointer are added to the script component. For entities with colliders in three-dimensional space, developers can implement interactions such as click, drag, and selection simply by filling in the corresponding hook functions.
| Hook Function | Trigger Timing and Frequency |
| --- | --- |
| onPointerEnter | Triggered once when the touch point enters the Entity's collider range |
| onPointerExit | Triggered once when the touch point leaves the Entity's collider range |
| onPointerDown | Triggered once when the touch point is pressed within the Entity's collider range |
| onPointerUp | Triggered once when the touch point is released within the Entity's collider range |
| onPointerClick | Triggered once when the touch point is pressed and released within the Entity's collider range |
| onPointerDrag | Continuously triggered while the touch point is pressed within the Entity's collider range, until the touch point is released |
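The trigger rules in the table can be sketched as a small dispatcher that compares the raycast result against the previously hovered and pressed targets. `PointerScript` and `PointerDispatcher` are simplified stand-ins for illustration, not the engine's real classes.

```typescript
// Illustrative sketch of the hook semantics described in the table.
interface PointerScript {
  onPointerEnter?(): void;
  onPointerExit?(): void;
  onPointerDown?(): void;
  onPointerUp?(): void;
  onPointerClick?(): void;
}

class PointerDispatcher {
  private hovered: PointerScript | null = null;
  private pressed: PointerScript | null = null;

  // Call once per frame with the script hit by the raycast (or null),
  // plus whether a press/release occurred this frame.
  update(hit: PointerScript | null, isDown: boolean, isUp: boolean): void {
    if (hit !== this.hovered) {
      // Enter/exit fire once, on the frame the hovered target changes.
      this.hovered?.onPointerExit?.();
      hit?.onPointerEnter?.();
      this.hovered = hit;
    }
    if (isDown && hit) {
      hit.onPointerDown?.();
      this.pressed = hit;
    }
    if (isUp && hit) {
      hit.onPointerUp?.();
      // Click fires only when press and release land on the same target.
      if (this.pressed === hit) hit.onPointerClick?.();
      this.pressed = null;
    }
  }
}
```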
Similar to MouseEvent and TouchEvent, PointerEvent can also be captured by listening.
canvas.addEventListener('pointerXXX', callBack);
| | MouseEvent | TouchEvent | PointerEvent |
| --- | --- | --- | --- |
| Press | mousedown | touchstart | pointerdown |
| Release | mouseup | touchend | pointerup |
| Move | mousemove | touchmove | pointermove |
| Leave | mouseout / mouseleave | touchend / touchcancel | pointerout / pointercancel / pointerleave |
The general process of Pointer handling can be summarized, where the green boxes represent native events.
The biggest problem to solve for Pointer is how to cast a ray into three-dimensional space from the position information carried by the native event. This involves not only the basics of space transformation but also the basic use of the physics system.
After capturing the PointerEvent, we need to:
We want the pointer's position relative to the target element, but native events expose many coordinate properties, so we first need to identify which ones carry the information we need.
| Native Event Coordinate Properties | Property Explanation |
| --- | --- |
| clientX & clientY | Coordinates relative to the application area that triggered the event (viewport coordinates) |
| offsetX & offsetY | Coordinates relative to the target element |
| pageX & pageY | Coordinates relative to the entire Document (including scrolled areas) |
| screenX & screenY | Coordinates relative to the top-left corner of the main display (rarely used) |
| x & y | Same as clientX & clientY |
They have the following conversion relationships (assuming the native event is `event` and the clicked target element is `canvas`):
The conclusion is: most coordinate properties can provide the desired coordinate information, with offset being the most direct and convenient.
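The conversion relationships above can be sketched as follows. This is a minimal sketch in which `rect` stands in for `canvas.getBoundingClientRect()` and `scrollX`/`scrollY` for the document scroll offsets; the function names are illustrative.

```typescript
// client -> offset: subtract the element's viewport position.
interface Rect { left: number; top: number; }

function clientToOffset(
  clientX: number,
  clientY: number,
  rect: Rect // canvas.getBoundingClientRect()
): { offsetX: number; offsetY: number } {
  return { offsetX: clientX - rect.left, offsetY: clientY - rect.top };
}

// page -> client: subtract the document scroll offsets.
function pageToClient(
  pageX: number,
  pageY: number,
  scrollX: number,
  scrollY: number
): { clientX: number; clientY: number } {
  return { clientX: pageX - scrollX, clientY: pageY - scrollY };
}
```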
Raycasting can be simplified as: derive a ray in three-dimensional space from the screen coordinates of the click, then perform collision detection between that ray and the colliders in the scene.
Taking a perspective camera as an example, after obtaining the coordinates clicked on the screen, the following steps are needed to get the ray:
Those familiar with graphics engines know that during rendering, we undergo the following transformations:
In other words, we only need to take the screen-space coordinates and apply the inverses of these space transformations.
It helps to have a general understanding of physical pixels, device-independent pixels (DIPs), and the device pixel ratio (devicePixelRatio). The coordinates taken from the offset properties of the click event are in device-independent pixels, so when computing screen-space coordinates, make sure the numerator and denominator use the same unit.
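The unit-consistency point can be sketched as follows, assuming `canvasWidth`/`canvasHeight` are the canvas's physical pixel sizes (`canvas.width`/`canvas.height`); the function name and the Y-up normalized convention are assumptions for the sketch.

```typescript
// offsetX/offsetY arrive in device-independent pixels, while
// canvas.width/height are physical pixels, so one side must be
// scaled by devicePixelRatio before dividing.
function offsetToNdc(
  offsetX: number,
  offsetY: number,
  canvasWidth: number,   // physical pixels (canvas.width)
  canvasHeight: number,  // physical pixels (canvas.height)
  devicePixelRatio: number
): { x: number; y: number } {
  // Convert DIPs to physical pixels so numerator and denominator match.
  const px = offsetX * devicePixelRatio;
  const py = offsetY * devicePixelRatio;
  // Map [0, width] x [0, height] to the [-1, 1] range, Y pointing up.
  return {
    x: (px / canvasWidth) * 2 - 1,
    y: 1 - (py / canvasHeight) * 2,
  };
}
```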
Clip space is a left-handed coordinate system in which X, Y, and Z all range from -1 to 1 (clip space can be intuitively understood as: anything rendered outside this range is clipped). Note the following when converting:

The matrices in the formula derivation are column-major.

Taking a perspective camera as an example, world space is transformed into clip space by the view transform followed by the projection transform, so converting from clip space back to world space only requires applying the inverses of those transforms.

Substituting the near-plane depth and the far-plane depth into the formula above yields the projections of the touch point onto the near and far planes in world space; connecting these two points gives the picking ray.
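The near/far unprojection can be sketched as follows, assuming a precomputed inverse view-projection matrix stored column-major (as in the derivation) and the -1..1 clip-space Z range described above; `ndcToRay` and `Mat4` are illustrative names.

```typescript
// Build the picking ray: unproject the NDC point at the near plane
// (z = -1) and the far plane (z = 1), then connect the two points.
type Mat4 = number[]; // 16 numbers, column-major: m[col * 4 + row]
type Vec3 = { x: number; y: number; z: number };

// Transform (x, y, z, 1) by m and perform the perspective divide.
function transformHomogeneous(m: Mat4, x: number, y: number, z: number): Vec3 {
  const w = m[3] * x + m[7] * y + m[11] * z + m[15];
  return {
    x: (m[0] * x + m[4] * y + m[8] * z + m[12]) / w,
    y: (m[1] * x + m[5] * y + m[9] * z + m[13]) / w,
    z: (m[2] * x + m[6] * y + m[10] * z + m[14]) / w,
  };
}

function ndcToRay(ndcX: number, ndcY: number, invViewProj: Mat4) {
  const near = transformHomogeneous(invViewProj, ndcX, ndcY, -1);
  const far = transformHomogeneous(invViewProj, ndcX, ndcY, 1);
  const dx = far.x - near.x, dy = far.y - near.y, dz = far.z - near.z;
  const len = Math.hypot(dx, dy, dz);
  return { origin: near, direction: { x: dx / len, y: dy / len, z: dz / len } };
}
```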
Colliders are composed of regular geometric shapes (boxes, spheres, etc.), so the relevant ray-geometry intersection algorithms can be consulted.
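As one example of such an algorithm, here is a standard ray-sphere test, solving |o + t·d − c|² = r² for the smallest non-negative t; the function name and signature are illustrative.

```typescript
// Ray vs. sphere intersection; returns the ray parameter t of the
// nearest hit in front of the origin, or null on a miss.
type V3 = { x: number; y: number; z: number };

function raySphere(
  origin: V3,
  dir: V3, // assumed normalized
  center: V3,
  radius: number
): number | null {
  const ox = origin.x - center.x;
  const oy = origin.y - center.y;
  const oz = origin.z - center.z;
  const b = ox * dir.x + oy * dir.y + oz * dir.z;          // half of the linear term
  const c = ox * ox + oy * oy + oz * oz - radius * radius; // constant term
  const disc = b * b - c;                                  // a = 1 since dir is normalized
  if (disc < 0) return null;                               // ray misses the sphere
  const t = -b - Math.sqrt(disc);
  return t >= 0 ? t : null;                                // only hits in front of the origin
}
```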
Once the physics engine returns the collider that was hit, its `Entity` can be considered the target of all `onPointerXXX` callbacks for the current frame. At this stage, we only need to invoke the script callbacks according to the collected native events.
As mentioned at the beginning, PointerEvent has compatibility issues. If your project may run on devices with older system versions, you can import our custom pointer-event polyfill: https://github.com/galacean/polyfill-pointer-event
KeyboardEvent can be captured by adding an event listener.
canvas.addEventListener('keyXXX', callBack);
| Event | Trigger Timing |
| --- | --- |
| keypress | Fired when a character key is pressed |
| keydown | Fired when any key is pressed |
| keyup | Fired when any key is released |
The general process of keyboard handling can be summarized, where the green boxes represent native events.
Regardless of case state or keyboard layout, a key can be represented by an enumerable value. If key values can be stored as an enumeration, it brings great convenience in both performance and usability, so we need to decide which property is suitable as the enumeration value.
The following are properties in KeyEvent that can be used as enumeration values:
| Property | Description | Simple Example | Compatibility |
| --- | --- | --- | --- |
| code | The physical key that triggered the event, independent of layout | Regardless of case or layout, pressing the Y key always returns the physical key "KeyY" | Compatible |
| key | The key value that triggered the event | "y" when lowercase, "Y" when uppercase | Compatible |
| charCode | Deprecated | | |
| keyCode | Deprecated | | |
| char | Deprecated | | |
It turns out that the most suitable property is `code`; see https://w3c.github.io/uievents-code/ for the full list of values.
The per-frame key interaction logic is relatively simple: maintaining three arrays, for keys pressed, released, and held down this frame, covers every need. The focus is on reducing the cost of the frame-level add, delete, and lookup operations.

Unordered arrays reduce the cost of adding and deleting elements in most cases. The following diagram shows the composition of an unordered array:

The following diagram shows how an unordered array reduces this cost:
If only the three unordered arrays are used, checking the state of a specific key still requires traversing an array, which can be costly in extreme cases. Using the key enumeration as the key, and whether it was pressed in the current frame as the value, avoids the traversal.

Although this makes queries faster, it adds a maintenance cost: the mapping table would have to be reset at the start of every frame. If, however, the frame number is stored as the value, this cost disappears entirely, since only the current frame number needs to be advanced each frame.
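The frame-number trick can be sketched as follows; `FrameKeyMap` and its method names are illustrative, not the engine's actual class.

```typescript
// Instead of booleans that must be reset every frame, store the frame
// number at which each key was last pressed: a key "was pressed this
// frame" iff its stored frame equals the current frame.
class FrameKeyMap {
  private curFrame = 0;
  private downFrame = new Map<string, number>();

  // The only per-frame maintenance: advance the frame counter.
  tick(): void {
    this.curFrame++;
  }

  // Fed by native keydown events.
  press(code: string): void {
    this.downFrame.set(code, this.curFrame);
  }

  isKeyDown(code: string): boolean {
    return this.downFrame.get(code) === this.curFrame;
  }
}
```

Stale entries never need clearing: once the frame counter moves on, they simply stop matching.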
Similarly, when recording keys that are held down, an additional table maps each key to its index in the unordered array.
| Key State | isKeyHeldDown | isKeyDown | isKeyUp |
| --- | --- | --- | --- |
| The key has been held down since the previous frame | true | false | false |
| The key was pressed in the current frame and not released | true | true | false |
| The key was released and pressed again in the current frame | true | true | true |
| The key was pressed and released in the current frame | false | true | true |
| The key was released in the current frame | false | false | true |
| The key was not pressed and had no interaction | false | false | false |
| This situation cannot occur | true | false | true |
| This situation cannot occur | false | true | false |