The document summarizes research on using an order relaxed accuracy (ORA) metric to evaluate the performance of a temporal point process model. Key points:
- ORA allows some flexibility in prediction order by counting predictions as correct if the event class is predicted within a set number (r) of future events.
- ORA was tested on synthetic and real datasets, showing it performs better than accuracy on noisy data and is less forgiving than time relaxed accuracy.
- When applied to a real dataset of Twitter users during a coordinated campaign, ORA identified a small number of highly predictive users, suggesting coordinated or repetitive behaviors.
- Future work involves analyzing the behaviors of highly predictive users and adding noise to the real dataset.
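As a rough illustration of the metric summarized above, the following Python sketch computes ORA over a sequence of event classes, plus the per-class (per-user) variant described in the slides. The function names and the toy sequences are hypothetical. It also assumes that the "next 𝑟 events" window is taken over the predicted sequence and includes position 𝑖 itself, which is one reading consistent with the statement that 𝑟 = 0 reduces to regular sequential accuracy.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


def order_relaxed_accuracy(true_classes: Sequence[Hashable],
                           pred_classes: Sequence[Hashable],
                           r: int) -> float:
    """Fraction of true events whose class appears among the predictions
    at positions i .. i + r (assumed window; r is the relaxation)."""
    if r < 0:
        raise ValueError("relaxation r must be non-negative")
    if len(true_classes) != len(pred_classes):
        raise ValueError("sequences must have the same length")
    if not true_classes:
        return 0.0
    correct = 0
    for i, event_class in enumerate(true_classes):
        # Count a hit if the true class is predicted anywhere in the window.
        if event_class in pred_classes[i:i + r + 1]:
            correct += 1
    return correct / len(true_classes)


def per_class_ora(true_classes: Sequence[Hashable],
                  pred_classes: Sequence[Hashable],
                  r: int) -> Dict[Hashable, float]:
    """ORA computed separately for each class (for example, each user):
    hits for the class divided by its occurrences in the true sequence."""
    hits: Dict[Hashable, int] = defaultdict(int)
    totals: Dict[Hashable, int] = defaultdict(int)
    for i, event_class in enumerate(true_classes):
        totals[event_class] += 1
        if event_class in pred_classes[i:i + r + 1]:
            hits[event_class] += 1
    return {c: hits[c] / totals[c] for c in totals}


# Toy data (hypothetical): the predictions are right but shifted by one step.
true_seq = ["a", "b", "a", "c"]
pred_seq = ["b", "a", "c", "a"]
print(order_relaxed_accuracy(true_seq, pred_seq, 0))  # 0.0
print(order_relaxed_accuracy(true_seq, pred_seq, 1))  # 0.5
```

With 𝑟 = 0 the toy predictions score zero because every position is off by one. A relaxation of 1 recovers the two events of class "a", which is the kind of order noise the metric is meant to tolerate.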
Isura Manchanayaka
Doctor of Philosophy – Engineering and IT [056957F]

Order Relaxed Accuracy
• For each event (𝑡, 𝑒) that appears in the 𝑖-th position of the true sequence of events,
• Check whether any event of class 𝑒 appears in the next 𝑟 events, where 𝑟 is an arbitrarily chosen parameter called the relaxation
• If there is one, it is counted as a correct prediction; otherwise it is ignored
• The relaxed accuracy metric is calculated as $\text{relaxed accuracy} = \frac{\text{correct predictions}}{\text{total number of events}}$

Questions
• How does the order relaxed accuracy perform on noisy synthetic data?
• How does the order relaxed accuracy perform on random data?
• How does the order relaxed accuracy perform on real data?

[Figures: Synthetic Data – n=10 (four slides of plots)]

Moving to large datasets
• IRA dataset
• 8.76 million tweets
• 3613 users

Coordinated Activity

Extracting an Active Time Window
• Starting from 2014-08-14
• Ending in 2014-08-28
• 885 users
• 264k tweets

Performing RMTPP on the dataset
• Time error: 16.503 s
• Accuracy: 1.16%

Pruning Retweet Network
• A retweet network is built using the data
• Nodes – Users
• Edges – Number of retweets between users. The edges are undirected.
• The nodes are iteratively pruned based on the sum of the weights of the edges belonging to each node
• If $\sum_{v \in V} w_{\{u,v\}} \le e_t$, where $e_t$ is the pruning threshold, then $u$ is pruned from the graph
• This is repeated until no node is pruned

Results

| Relaxation | IRA Data – 885 users | IRA Data (Pruned with threshold 400) – 57 users | Randomly Generated Data – 57 users |
|---|---|---|---|
| 0 | 1.16% | 2.59% | 1.79% |
| 5 | 1.64% | 3.07% | 1.80% |
| 10 | 1.82% | 3.31% | 1.80% |
| 100 | 2.18% | 4.27% | 1.80% |
| 1000 | 2.38% | 4.69% | 1.80% |

Order Relaxed Accuracy
• For each user in the true sequence of events, the number of correct predictions for that user is divided by the number of times the user appears in the true sequence, giving the order relaxed accuracy per user

Results – Number of users with individual relaxed accuracy more than some given value (IRA Data – Without Pruning)

| Accuracy Lower Bound | Relaxation = 0 | Relaxation = 10 | Relaxation = 100 | Relaxation = 1000 |
|---|---|---|---|---|
| 80% | 1/885 (0.11%) | 1/885 (0.11%) | 2/885 (0.23%) | 5/885 (0.56%) |
| 50% | 1/885 (0.11%) | 4/885 (0.45%) | 5/885 (0.56%) | 5/885 (0.56%) |
| 20% | 3/885 (0.34%) | 4/885 (0.45%) | 5/885 (0.56%) | 5/885 (0.56%) |
| 10% | 4/885 (0.45%) | 5/885 (0.56%) | 5/885 (0.56%) | 5/885 (0.56%) |
| 5% | 4/885 (0.45%) | 5/885 (0.56%) | 5/885 (0.56%) | 6/885 (0.68%) |
| 0% | 5/885 (0.56%) | 5/885 (0.56%) | 5/885 (0.56%) | 8/885 (0.90%) |

Results – Number of users with individual relaxed accuracy more than some given value (IRA Data – Pruned with threshold 400)

| Accuracy Lower Bound | Relaxation = 0 | Relaxation = 10 | Relaxation = 100 | Relaxation = 1000 |
|---|---|---|---|---|
| 80% | 0/57 (0%) | 0/57 (0%) | 2/57 (3.51%) | 2/57 (3.51%) |
| 50% | 1/57 (1.75%) | 2/57 (3.51%) | 2/57 (3.51%) | 2/57 (3.51%) |
| 20% | 2/57 (3.51%) | 2/57 (3.51%) | 2/57 (3.51%) | 2/57 (3.51%) |
| 10% | 2/57 (3.51%) | 2/57 (3.51%) | 2/57 (3.51%) | 2/57 (3.51%) |
| 5% | 2/57 (3.51%) | 2/57 (3.51%) | 2/57 (3.51%) | 2/57 (3.51%) |
| 0% | 2/57 (3.51%) | 2/57 (3.51%) | 2/57 (3.51%) | 2/57 (3.51%) |

Analysis
• ORA gives a better metric for predictions on noisy data
• ORA is less forgiving about errors in time, since the timestamp is irrelevant but the order is not
• ORA is more natural than the Time Relaxed Accuracy, since 𝑟 = 0 in ORA corresponds to regular sequential accuracy
• A small portion of the users dominates the predictions, which implies highly predictive behaviors

Future Work
• Need to check the activities of those highly predictive users
• Need to add noisy data to the IRA dataset

Thank You
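
The retweet-network pruning described in the slides can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the function name, the edge-dictionary input format, and the toy weights are hypothetical, self-retweets are ignored, and all nodes at or below the threshold are removed together in each round. Removing them one at a time should give the same final node set, since a node's weight sum can only shrink as neighbours disappear.

```python
from collections import defaultdict
from typing import Dict, Hashable, Set, Tuple


def prune_retweet_network(edge_weights: Dict[Tuple[Hashable, Hashable], float],
                          threshold: float) -> Set[Hashable]:
    """Return the users that survive iterative pruning.

    edge_weights maps an undirected pair (u, v) to the number of retweets
    between u and v. A node is pruned when the sum of its edge weights to
    the remaining nodes is <= threshold; rounds repeat until nothing changes.
    """
    adjacency: Dict[Hashable, Dict[Hashable, float]] = defaultdict(dict)
    for (u, v), weight in edge_weights.items():
        if u == v:
            continue  # assumption: self-retweets do not count as edges
        # Store the edge in both directions because the graph is undirected.
        adjacency[u][v] = adjacency[u].get(v, 0) + weight
        adjacency[v][u] = adjacency[v].get(u, 0) + weight

    remaining: Set[Hashable] = set(adjacency)
    while True:
        to_prune = {
            u for u in remaining
            if sum(w for v, w in adjacency[u].items() if v in remaining) <= threshold
        }
        if not to_prune:
            return remaining
        remaining -= to_prune


# Toy network (hypothetical weights).
edges = {("a", "b"): 5, ("b", "c"): 3, ("c", "d"): 1, ("a", "c"): 4}
print(sorted(prune_retweet_network(edges, threshold=4)))  # ['a', 'b', 'c']
print(sorted(prune_retweet_network(edges, threshold=7)))  # []
```

With a threshold of 7 the toy network collapses entirely: removing "d" lowers the weight sum of "c" to 7, and removing "c" then drops "a" and "b" to 5 each. This cascade is why the pruning has to be repeated until no node is removed.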