
FLOW



Filtering and LSTM-based Optimization for Web Browser Interactions
Dongyijie Pan, Zhengnan Li, and Guangzheng Fei, Communication University of China
Long Ling, Tongji University

Introduction

Web browsers have become an indispensable tool for human-computer interaction across various platforms. However, interactive performance (e.g., the immediacy of interaction and the accuracy of interaction data) in browser-based apps can be hindered by hardware limitations and network delays, resulting in unwanted jitter and input signal delays that degrade the user experience. To tackle these challenges, we propose an approach that combines effective filtering algorithms with lightweight machine learning models. By integrating the 1 Euro filter and a Long Short-Term Memory (LSTM) network, we improve the accuracy of MediaPipe-based gesture interactions and of a tangible table application within web browsers. Our findings demonstrate that this combined approach not only enhances input signal accuracy but also ensures seamless and highly responsive web browser interactions.

Figure: The MediaPipe hand landmark detection in the browser is inaccurate in many scenarios.

Figure: Fast movements result in pronounced on-screen smearing in the TUIO-based tangible interaction web app. The rendered position cannot keep up with the token at high speed.

Methods

We innovatively combined the 1 Euro filter with the LSTM algorithm and, with the aid of TensorFlow.js, implemented a lightweight filtering plus RNN algorithm on the browser side. This addresses a variety of issues encountered in browser-based interactions.

Compared to traditional Kalman filters, the 1 Euro filter is more user-friendly for HCI researchers in terms of parameter tuning. It has only two adjustable parameters and can smooth interaction data in real time with O(1) time complexity per sample. This makes it a small yet elegant filtering algorithm in the field of Human-Computer Interaction.

Although the 1 Euro filter significantly reduces trailing, jitter, and latency in interactions, this pre-machine-learning-era algorithm can be substantially strengthened today by incorporating neural-network-based temporal trajectory prediction. In temporal trajectory prediction, an LSTM can be used to implement a 'Seq2Seq' approach, in which past trajectory data is used to predict future trajectory data. One could instead build an 'N-to-1' prediction model, which predicts the next coordinate position from the previous N steps, but such a sliding-window prediction method is too time-consuming under low computational power.

We recommend that web developers first apply the 1 Euro filter to reduce significant jitter, and then gather latency statistics on the filtered data. For example, under a fixed sampling rate, we observe how many frames the filtered data lags behind the ground truth. We found that using the maximum lag value multiplied by 1.5 or 2 as input to the LSTM yields significantly improved prediction accuracy.

To handle an uncertain frame delay, we use the alpha parameter of the filter to adaptively select among the predictive frames generated by the LSTM. By applying a linear mapping from the alpha parameter onto the prediction sequence, this approach mitigates some of the overfitting issues that arise when a fixed frame is used as the predictive frame.

Figure: Paradigm for Data Preprocessing and Training in FLOW

Experiments and Results

We conducted two experiments. The first focused on MediaPipe Hands across various devices. The reference for comparison was the GPU-accelerated Python version of MediaPipe. We measured error as the Mean Squared Error (MSE) of each key node after FLOW processing against this reference. After optimization with the FLOW algorithm, the accuracy and performance of the browser-based MediaPipe Hands improved significantly compared with the unoptimized version.

For the second, the application of interactive table tokens in browsers, we divided the movement of the tokens into three steps, based on the characteristics most closely related to the errors that occur in physical interactions, such as sudden acceleration and deceleration. Following the theory of Jota et al., we categorized the collected data into three stages according to the acceleration of the token in each frame: (1) initial reaction, (2) large ballistic motion, and (3) feedback-adjusted final adjustments. Based on these three stages, we organized the data into a table covering four conditions: original data, data after 1 Euro filter processing, data after prediction using LSTM + mean latency frame, and data after α-mapped frame and subsequent filtering.

Conclusion
FLOW provides a feasible and experimentally implementable paradigm for addressing jitter and latency in complex human-computer interaction applications on the browser platform. The 1 Euro filter plays three roles: it preprocesses the data before the neural network, it dynamically selects among the predicted values, and it further smooths the output after the neural network's predictions. Moreover, our optimized and integrated algorithm remains lightweight in terms of both time and space complexity. In application scenarios where prompt human-computer interaction feedback is crucial, our algorithm effectively addresses the two issues encountered in our experiments. We believe it has the potential to be applied in various interactive applications on the browser platform.
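For readers unfamiliar with the 1 Euro filter, the following Python sketch follows its commonly published formulation and illustrates why each update is cheap. It is only an illustration, not FLOW's browser-side implementation: it assumes a scalar signal and a fixed sampling rate, and the class name, the default values, and the exposed `alpha` attribute are choices made for this example.

```python
import math


class OneEuroFilter:
    """Minimal 1 Euro filter for a scalar signal sampled at a fixed rate."""

    def __init__(self, rate_hz, min_cutoff=1.0, beta=0.0, d_cutoff=1.0):
        self.rate_hz = rate_hz        # sampling rate in Hz (assumed constant)
        self.min_cutoff = min_cutoff  # tunable: lower values remove more jitter
        self.beta = beta              # tunable: higher values reduce lag at speed
        self.d_cutoff = d_cutoff      # cutoff for the speed estimate, usually fixed
        self.x_prev = None            # last filtered value
        self.dx_prev = 0.0            # last filtered speed
        self.alpha = 1.0              # smoothing factor used for the latest sample

    def _alpha(self, cutoff):
        # Smoothing factor of a first-order low-pass filter at this cutoff.
        tau = 1.0 / (2.0 * math.pi * cutoff)
        te = 1.0 / self.rate_hz
        return 1.0 / (1.0 + tau / te)

    def __call__(self, x):
        if self.x_prev is None:       # the first sample passes through unchanged
            self.x_prev = x
            return x
        # Estimate the speed of the signal and smooth that estimate.
        dx = (x - self.x_prev) * self.rate_hz
        a_d = self._alpha(self.d_cutoff)
        dx_hat = a_d * dx + (1.0 - a_d) * self.dx_prev
        # Faster motion raises the cutoff, which raises alpha (less smoothing).
        cutoff = self.min_cutoff + self.beta * abs(dx_hat)
        self.alpha = self._alpha(cutoff)
        x_hat = self.alpha * x + (1.0 - self.alpha) * self.x_prev
        self.x_prev, self.dx_prev = x_hat, dx_hat
        return x_hat
```

Each call performs a fixed number of arithmetic operations and keeps only the previous filtered value and speed, which is consistent with the constant per-sample cost noted in Methods. For 2D or 3D coordinates, one filter instance per axis is a simple option.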
In the coming years, we expect to have the opportunity to incorporate more complex yet powerful machine learning models into browsers. At the same time, the achievements of HCI researchers from before the surge of machine learning algorithms, such as the $1 Unistroke Recognizer, remind us that, apart from feeding interaction data into black-box-like hidden layers, we humans still possess countless ingenious ideas for solving seemingly challenging problems with concise code and remarkably low time complexity. Together, we will expand the possibilities of the browser, the most fundamental medium for communication between humans and computers, and between humans and the internet. The potential for the future is immense.

Figure: Application-Level Processing Paradigm of FLOW
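The poster does not spell out the latency statistics or the α-mapped frame selection described in Methods, so the following Python sketch shows only one possible reading. The function names, the use of mean squared error to align the two signals, and the direction of the linear mapping (a smaller α, meaning heavier smoothing, selects a frame further ahead) are all assumptions made for illustration.

```python
def estimate_lag_frames(truth, filtered, max_lag):
    """Return the shift in frames that best aligns `filtered` with `truth`."""
    best_lag, best_err = 0, float("inf")
    for lag in range(max_lag + 1):
        # Pair truth[i] with filtered[i + lag]: filtered trails truth by `lag`.
        pairs = list(zip(truth, filtered[lag:]))
        if not pairs:
            break
        err = sum((t - f) ** 2 for t, f in pairs) / len(pairs)
        if err < best_err:
            best_lag, best_err = lag, err
    return best_lag


def select_predicted_frame(predictions, alpha):
    """Pick one frame from a prediction sequence by linearly mapping alpha."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    a = min(max(alpha, 0.0), 1.0)  # clamp alpha to [0, 1]
    # ASSUMPTION: alpha near 1 (little smoothing, little lag) selects the
    # nearest predicted frame; alpha near 0 selects the farthest one.
    index = round((1.0 - a) * (len(predictions) - 1))
    return predictions[index]
```

Here `predictions` stands for the sequence of future frames produced by the LSTM, ordered from nearest to farthest, and `alpha` for the smoothing factor the filter used on the latest sample. The largest lag found by a routine like `estimate_lag_frames` would be the value that the poster scales by 1.5 or 2.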

Long Ling, from TJU. Email: lucyling0224@gmail.com
Guangzheng Fei, from CUC. Email: gzfei@cuc.edu.cn
Dongyijie Pan, from CUC. Email: primojaypan@cuc.edu.cn
Zhengnan Li, from CUC. Email: lzhengnan389@gmail.com
