
Subject: Progress Update and Inquiry on Handling Model Performance in Empirical Research

Dear Professor Lu,

I hope this email finds you well. I apologize for the delayed contact. During this period, I have
been addressing revisions for previously submitted articles and adjusting the content of my
graduation thesis based on feedback from my supervisor. To avoid any disruption to our
collaboration, I aim to finalize the thesis framework within the next two weeks and then begin writing.

Regarding the progress of our collaborative article, I have some exciting news to share. Our
submission to BMC Methodology has been accepted. In this article, we used the
generalized SEM method, as discussed in our previous communication, to address outlier
detection issues in cross-sectional spatial data with spatial heterogeneity. Our model outperformed
others in terms of outlier detection performance and coefficient estimation. This success reinforces
my confidence in our future research endeavors, and I am aiming to submit our new article to
Statistics in Medicine. I have attached the recently published paper for your reference.

Regarding the preliminary results you described as "interesting" last time, I used the same
generalized approach with a slight variation: I fixed the parameter μ at a specified value rather
than estimating it, in order to compare our model against others under the assumption that μ is
estimated accurately. I carefully reviewed your suggestion on estimating μ, and I find it highly
practical. In the next iteration of the algorithm, I plan to estimate μ together with two other
parameters (or one parameter, if the intercept term is omitted for greater generality), updating
them cyclically and checking a convergence criterion at each pass until convergence
is achieved.
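
To make the planned iteration concrete, here is a minimal Python sketch of the cyclic scheme on a toy problem. It is only an illustration under simplifying assumptions: a plain linear model y = μ + βx stands in for our generalized SEM, and the function name cyclic_fit, the tolerance, and the two update rules are placeholders rather than our actual estimators.

```python
def cyclic_fit(x, y, tol=1e-10, max_iter=1000):
    """Toy cyclic estimation of y = mu + beta * x (illustrative stand-in only).

    Each pass updates mu with beta held fixed, then beta with mu held fixed,
    and stops once the largest parameter change falls below tol.
    """
    n = len(x)
    sxx = sum(xi * xi for xi in x)
    mu, beta = 0.0, 0.0  # arbitrary starting values (assumption)
    for iteration in range(1, max_iter + 1):
        # Step 1: update mu given the current beta (mean of partial residuals).
        mu_new = sum(yi - beta * xi for xi, yi in zip(x, y)) / n
        # Step 2: update beta given the new mu (least squares through the origin).
        beta_new = sum(xi * (yi - mu_new) for xi, yi in zip(x, y)) / sxx
        # Convergence criterion: largest absolute change across parameters.
        change = max(abs(mu_new - mu), abs(beta_new - beta))
        mu, beta = mu_new, beta_new
        if change < tol:
            return mu, beta, iteration, True
    return mu, beta, max_iter, False


# Noise-free toy data generated from mu = 1, beta = 2.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
mu_hat, beta_hat, n_iter, converged = cyclic_fit(x, y)
```

In the real algorithm, each step would be replaced by the corresponding estimator for that parameter, but the loop structure and the stopping rule would be the same.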

I would like to raise a point for your consideration: in the article "Robust penalized estimators
for functional linear regression," the penalty parameters are selected using the RCV(λ) method.
If this is combined with the outer loop for μ, the algorithmic complexity may increase
significantly, although the impact on overall computational time might not be substantial. I intend
to conduct extensive simulations to confirm this.

After finalizing the framework for my graduation thesis (probably within two weeks), I will dedicate my
efforts to our collaborative article. I have been contemplating your suggestion of prioritizing a
specific dataset for research, particularly exploring cross-sectional health-related spatial data with
spatial heterogeneity. However, I have a question on which I would like your advice. Suppose
we have selected a dataset for our study and completed all the simulations, but in the
empirical analysis our model underperforms compared to others. How should we
proceed? Should we consider changing the dataset, or refrain from emphasizing model
performance as the primary focus of the article? I tend to plan a study by outlining the research
questions, the methodologies, and the conclusions we hope to reach (not preset conclusions), and by
considering various scenarios in advance to ensure smooth progress. I realize this may be a trivial
question, and I would greatly appreciate your guidance on this matter.

Thank you for your support and mentorship throughout our collaboration. I look forward to your
insights on the above query.

Warm regards,

[Your Name]
