<?xml version="1.0"?>
<!DOCTYPE ArticleSet PUBLIC "-//NLM//DTD PubMed 2.0//EN" "http://www.ncbi.nlm.nih.gov/entrez/query/static/PubMed.dtd">
<ArticleSet>
  <Article>
    <Journal>
      <PublisherName>Sichuan Knowledgeable Intelligent Sciences</PublisherName>
      <JournalTitle>International Scientific Technical and Economic Research</JournalTitle>
      <Issn>2959-1309</Issn>
      <Volume>4</Volume>
      <Issue>2</Issue>
      <PubDate PubStatus="epublish">
        <Year>2026</Year>
        <Month>04</Month>
        <Day>02</Day>
      </PubDate>
    </Journal>
    <ArticleTitle>Research on Long Sequence Learning Behavior Modeling Based on Transformer-XL</ArticleTitle>
    <FirstPage>1</FirstPage>
    <LastPage>20</LastPage>
    <ELocationID EIdType="doi">10.71451/ISTAER2613</ELocationID>
    <Language>eng</Language>
    <AuthorList>
      <Author>
        <FirstName>Yuxiao</FirstName>
        <LastName>Qin</LastName>
        <Affiliation>Education College, Seoul School of Integrated Sciences and Technologies, Seoul, Republic of Korea</Affiliation>
        <Identifier Source="ORCID">0009-0001-5420-5269</Identifier>
      </Author>
    </AuthorList>
    <History>
      <PubDate PubStatus="received">
        <Year>2026</Year>
        <Month>04</Month>
        <Day>02</Day>
      </PubDate>
    </History>
    <Abstract>
To address the difficulty of modeling long-sequence dependencies in online learning behavior data and the associated high computational complexity, this paper proposes a long-sequence learning behavior modeling method based on Transformer-XL. The method improves model performance at both the structural and information-modeling levels by constructing multidimensional behavior feature representations and integrating a dynamic memory enhancement mechanism, behavior-semantic-aware attention, and a sparse long-sequence modeling strategy. Experiments on real educational datasets, including ASSISTments and EdNet, show that the proposed model outperforms mainstream methods on AUC, ACC, and RMSE, with AUC improving by about 4.2% and RMSE decreasing by about 8.1%. Ablation experiments and parameter analysis further verify the effectiveness of each module, while cross-dataset experiments and noise tests demonstrate that the model generalizes well and is robust. In addition, interpretability analysis shows that the model effectively attends to key learning behaviors. These results indicate that the proposed method offers significant advantages for long-sequence learning behavior modeling and provides effective support for personalized recommendation and learning-state evaluation in intelligent education systems.
</Abstract>
  </Article>
</ArticleSet>
