CAT Theses by Year


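I crawled Peking University's thesis repository and searched it by advisor name, so theses from other programs may be included.

This post is meant mainly for full-text search; it is not very convenient to browse.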


2019-05-30

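Deep Learning Based Automatic Grammar Error Correction. Huang Haoyang (黄浩洋)

Link

Title:

 Deep Learning Based Automatic Grammar Error Correction

Name:

 Huang Haoyang (黄浩洋)

Student ID:

 1601210559

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Availability:

 Public

Degree level:

 Master's

Degree:

 Professional Master of Engineering

Institution:

 Peking University

School:

 School of Software and Microelectronics

Advisor 1 name:

 Yu Jingsong (俞敬松)

Advisor 1 affiliation:

 School of Software and Microelectronics

Defense date:

 2019-05-30

Foreign-language title:

 Deep learning based automatic grammar error correction

Keywords:

 Natural language processing; automatic grammar correction; parallel corpus; deep learning; pre-training

Foreign-language keywords:

 Natural language processing; Automatic grammar correction; Parallel corpus; Deep learning; Pre-training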

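Abstract:

    Automatic grammatical error correction (GEC) is one of the harder tasks within syntactic analysis in natural language processing. In everyday conversation, subtle grammatical distinctions are the hardest thing for a non-native speaker to grasp and understand. Grammar correction for natural language today covers not only grammatical errors but also spelling and collocation errors.
    In recent years, with the development of deep learning, GEC has attracted considerable attention. Phrase-based methods built on statistical machine translation (SMT) treat GEC as a translation task, converting "bad" text into "good" text, and the corpora used are parallel corpora similar to translation corpora. Unlike the translation-based approaches that rely on recurrent neural networks (RNNs), other work uses convolutional neural networks (CNNs) to encode sentences and extract phrase-based semantic representations. These methods all build end-to-end encoder-decoder sequence-to-sequence (seq2seq) models that locate grammatical errors by learning the semantic and wording differences between erroneous and correct sentences.
    Supervised learning is the most common way to learn fully from the data. It requires large amounts of labeled data, and labeling is very costly. Researchers have found that unlabeled data can be used for unsupervised learning, mining valuable semantic information from it to help other supervised tasks. Examples include pre-trained models based on translation corpora, language models pre-trained on long-text corpora, and generalizable pre-trained models that combine multiple tasks. These pre-trained models have been validated on many tasks and can substantially improve model performance.
    Although GEC models can draw on relatively new model architectures, the shortage of GEC corpora means that broader-coverage correction and models with practical application value remain unsatisfactory. This study proposes a new stacked model structure into which features carrying rich pre-trained semantic information can be embedded, yielding a multi-layer automatic correction model that can adapt to multiple pre-training methods. The model can correct errors over multiple rounds of iteration, and, to further ease the shortage of GEC corpora, it uses dual learning to generate additional training data. The overall correction framework helps capture the correlation between words, the coherence of phrases, and the matching of semantics, as well as the grammatical accuracy of sentences.
    The staged model structure makes modules highly replaceable and extensible. The open-sourced parallel correction corpus and real correction examples show that the model achieves good results on academic datasets and can also be applied in practical scenarios. The framework can further incorporate the weights of the latest pre-trained models, giving it strong extensibility, which the author states other work does not have. This makes the study more meaningful and valuable for future research.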

Foreign-language abstract:

    Automatic grammatical error correction (GEC) is one of the most difficult tasks in syntactic analysis within natural language processing. In daily conversation, grammatical nuances are the hardest thing for a non-native speaker to grasp and understand. Grammar correction in current natural language processing covers not only grammatical errors but also spelling and collocation errors.
    In recent years, with the development of deep learning, the GEC task has received a lot of attention. The phrase-based method built on statistical machine translation (SMT) regards GEC as a translation task from "bad" to "good", and the corpus used is a parallel corpus similar to a translation corpus. Unlike the translation-based approaches that rely on recurrent neural networks (RNNs), there are also approaches that use convolutional neural networks (CNNs) for sentence encoding and for extracting phrase-based semantic representations. These methods locate grammatical errors by building an encoder-decoder sequence-to-sequence (seq2seq) model that learns the semantic and wording differences between erroneous and correct sentences.
    Supervised learning is the most common way to learn fully from the data. It requires a large amount of annotated data, and the cost of labeling is high. Researchers have found that unlabeled data can be used for unsupervised learning, mining valuable semantic information to help other supervised tasks. Examples include pre-trained models based on translation corpora, pre-training on long-text corpora, and generalized pre-trained models that combine multiple tasks. These pre-trained models have been tested on many tasks and can greatly improve model performance.
    Although automatic error correction models can build on relatively new architectures, wider-coverage correction and models with practical application value are still unsatisfactory because error correction corpora are scarce. This study proposes a new stacked model structure that can embed features carrying rich pre-trained semantic information, yielding a multi-layer automatic error correction model that adapts to multiple pre-training methods. The model can correct errors over multiple rounds of iteration and, to further alleviate the lack of error correction corpora, uses dual learning to generate additional training data. The overall error correction framework helps capture the correlation between words, the coherence of phrases, and the matching of semantics, as well as the grammatical accuracy of sentences.
    The staged model structure makes the modules highly replaceable and extensible. The open-source parallel error correction corpus and real error correction examples show that the model achieves good results on academic datasets and can also be applied in real scenarios. The framework can further integrate the weights of current pre-trained models, making it highly extensible, which the author states other work does not offer. This makes the study more meaningful and valuable for future research.
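The multi-round iterative correction mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's model: `iterative_correct`, `toy_corrector`, and `TOY_FIXES` are hypothetical names, and the toy corrector is a lookup table standing in for a trained seq2seq network that fixes one error per pass.

```python
from typing import Callable, Tuple

def iterative_correct(
    sentence: str,
    correct_once: Callable[[str], str],
    max_rounds: int = 5,
) -> Tuple[str, int]:
    """Apply a single-pass corrector repeatedly until the output stops changing.

    Returns the final sentence and the number of rounds that changed it.
    """
    current = sentence
    rounds = 0
    while rounds < max_rounds:
        revised = correct_once(current)
        if revised == current:  # fixed point: nothing left to correct
            break
        current = revised
        rounds += 1
    return current, rounds

# Hypothetical stand-in for a trained model: fixes at most one error per pass.
TOY_FIXES = [
    ("He go ", "He goes "),
    ("a apple", "an apple"),
    ("every days", "every day"),
]

def toy_corrector(sentence: str) -> str:
    for wrong, right in TOY_FIXES:
        if wrong in sentence:
            return sentence.replace(wrong, right, 1)
    return sentence

if __name__ == "__main__":
    print(iterative_correct("He go to school every days with a apple", toy_corrector))
```

With a real model, `correct_once` would be one decoding pass; the loop structure stays the same, and `max_rounds` bounds the cost when the model keeps rewriting its own output.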

Classification number:

 TP3

Total pages:

 65

Total references:

 50

Release date:

 2019-06-11    

2019-05-29

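Research on the Extraction of Student English Error Detection Rules Based on Natural Language Processing. Yang Yue (杨越)

Link

Title:

 Research on the Extraction of Student English Error Detection Rules Based on Natural Language Processing

Name:

 Yang Yue (杨越)

Student ID:

 1601210810

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Availability:

 Public

Degree level:

 Master's

Degree:

 Professional Master of Engineering

Institution:

 Peking University

School:

 School of Software and Microelectronics

Advisor 1 name:

 Yu Jingsong (俞敬松)

Advisor 1 affiliation:

 School of Software and Microelectronics

Defense date:

 2019-05-29

Foreign-language title:

 Research on the Extraction of English Error Detection Rules based on Natural Language Processing

Keywords:

 Composition error detection; rule extraction; rule matching; natural language processing

Foreign-language keywords:

 Composition correction; Rules extraction; Rules matching; Natural language processing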

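Abstract:

Research on second language acquisition has shown that effective corrective feedback helps second-language learners improve their proficiency. Several error-correction tools for English writing are now on the market, such as LanguageTool and Grammarly abroad and Pigai (批改网) in China. Most of these tools are limited to spelling and grammatical errors; for more subjective errors such as Chinglish, collocation errors, sentence-pattern errors, and vague meaning, they rely mainly on manually written rules. In addition, although open-source tools such as LanguageTool can detect errors, they cannot be flexibly adapted and modified to suit the characteristics of the rules, and detection is slow. To address these problems, this study proposes extracting error-detection rules semi-automatically from annotated learner corpora and then developing a lightweight rule matcher to verify and apply the rules.

The study first extracts English correction rules. A detailed analysis of the annotated learner corpora CLEC and NUCLE determines which error categories can be extracted automatically by a program; an algorithm implemented in Java performs the initial rule extraction and writes the results to a MySQL database. The extracted rules are then tested and verified, and filtered manually. Finally, resources such as the Oxford Collocations Dictionary and Google Books are used to extend the existing rules, with the aim of helping learners learn English through error correction.

Second, the author designs and implements a lightweight rule matcher. The matcher is designed around the extracted rules; it can verify the semi-automatically extracted rule table and also demonstrates the feasibility of semi-automatic rule extraction from learner corpora.

The result of this study is a set of English correction rules extracted from learner corpora by a systematic method, with detection accuracy above 90%. The rules are also preprocessed, which gives experts a reliable basis for later revision and reduces time cost. In addition, the lightweight rule matcher, designed and implemented as a complement to LanguageTool, improves speed by more than 30% and handles custom rules efficiently. The study shows that this semi-automatic way of generating and applying rules improves efficiency, saves labor, and can help English learners. The project is also general and easy to extend to other learner corpora or language resources for further research.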

Foreign-language abstract:

Studies of second language acquisition have shown that effective corrective feedback helps learners develop their second-language ability. There are currently several error-correction tools for English writing on the market, such as LanguageTool and Grammarly abroad and Pigai.org in China. However, most of these tools are limited to spelling and grammatical errors in English writing, and detection relies mainly on manually written rules when it involves subjective errors such as Chinglish, collocation errors, sentence-pattern errors, and ambiguous meaning. Although open-source tools such as LanguageTool can identify errors, their rules cannot be adapted and changed flexibly, and recognition is slow. To address these issues, this study uses annotated learner corpora to extract error detection rules semi-automatically and then develops a lightweight rule matcher to verify and apply the extracted rules.

The first part is the extraction of English correction rules. First, the error categories that a program can extract automatically are determined through a detailed, systematic analysis of the existing annotated learner corpora CLEC and NUCLE. Second, the initial rule extraction is implemented as an algorithm in a Java program, and the extraction results are written to a MySQL database. The extracted rules are then tested and verified, and filtered manually. Finally, resources such as the Oxford Collocations Dictionary and Google Books are used to extend the existing rules, so as to help learners learn English by correcting errors.

The second part is the design and implementation of a lightweight rule matcher. The matcher is designed for the extracted rules and the existing rule base. On the one hand, the semi-automatically extracted rule table can be verified conveniently. On the other hand, the feasibility of extracting rules semi-automatically from a learner corpus can be demonstrated.

The result of this research is that English error correction rules can be extracted from learner corpora by systematic methods, with an accuracy of over 90%. Moreover, the rules are preprocessed, which provides a reliable basis for subsequent expert correction and reduces time cost. Furthermore, a lightweight rule matcher is designed and implemented as a complement to LanguageTool; it increases speed by more than 30% and handles various customized rules efficiently. The study shows that semi-automatically generating and applying rules improves efficiency, saves manpower, and helps English learners. The project is also general and extensible, so it can be extended to other learner corpora or resources for further research.
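The extract-then-match workflow described in the abstract can be sketched roughly as follows. This is an illustration only: the thesis implemented its extraction in Java with MySQL storage, whereas the sample pairs, `extract_rules`, `match_rules`, and the `min_count` threshold here are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical (erroneous span, correction) pairs, as might be read
# from an error-annotated learner corpus.
ANNOTATED = [
    ("learn knowledge", "acquire knowledge"),
    ("learn knowledge", "acquire knowledge"),
    ("learn knowledge", "gain knowledge"),
    ("big rain", "heavy rain"),
    ("open the light", "turn on the light"),
    ("open the light", "turn on the light"),
]

def extract_rules(pairs, min_count=2):
    """Keep the most frequent correction for each error seen at least min_count times."""
    counts = Counter((wrong.lower(), right.lower()) for wrong, right in pairs)
    rules = {}
    for (wrong, right), n in counts.most_common():
        if n >= min_count and wrong not in rules:
            rules[wrong] = right
    return rules

def match_rules(text, rules):
    """Return (offset, matched text, suggestion) for every rule hit, in text order."""
    hits = []
    for wrong, right in rules.items():
        pattern = re.compile(r"\b" + re.escape(wrong) + r"\b", re.IGNORECASE)
        for m in pattern.finditer(text):
            hits.append((m.start(), m.group(0), right))
    return sorted(hits)

if __name__ == "__main__":
    rules = extract_rules(ANNOTATED)
    for hit in match_rules("Please open the light so I can learn knowledge.", rules):
        print(hit)
```

The frequency threshold plays the role of a crude automatic filter; in the workflow the abstract describes, manual screening of the extracted rules would follow.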

Classification number:

 TP3

Total pages:

 60

Total references:

 40

参考文献列表:
[1] 赵东阳. 语料库方法与二语习得界面研究综述[J]. 海外英语(上), 2017(10).
[2] 刘蕾, 海娜. 网络英文写作在大学英语教学中的应用研究[J]. 海外英语, 2018.
[3] 张加加. 初探中介语理论[J]. 赤子, 2014.
[4] Ashwell T. Patterns of Teacher Response to Student Writing in a Multiple-Draft Composition Classroom: Is Content Feedback Followed by Form Feedback the Best Method? [J]. Journal of Second Language Writing, 2000, 9(3):227-257.
[5] Scanlon M J. Improving Student Writing Through Multiple Peer Feedback[C]// Frontiers in Education Conference. IEEE, 2013.
[6] 周一书. 大学英语写作反馈方式的对比研究[J]. 外语界, 2013(3):87-96.
[7] Yu Yang. An Empirical Study on the Effects of Self-Correction Based on the Pigai Network on College EFL Students' Writing Proficiency[A]. 东北亚语言文学与翻译国际学术论坛组委会. Proceedings of the Sixth Northeast Asia International Symposium on Language,Literature and Translation[C].东北亚语言文学与翻译国际学术论坛组委会:辽宁省翻译学会, 2017:6
[8] Smet M J R D, Broekkamp H, Brand-Gruwel S, et al. Effects of electronic outlining on students’ argumentative writing performance[J]. Journal of Computer Assisted Learning, 2011, 27(6):557-574
[9] CORDERS S P. The significance of learner’s error[J]. International Review of Applied Linguistic, 1967 (4) :161-170.
[10] 桂诗春. 以语料库为基础的中国学习者英语失误分析的认知模型[J]. 现代外语, 2004, 27(2).
[11] 蔡龙权, 戴炜栋. 错误分类的整合[J]. 外语界, 2001(4):52-57.
[12] 李悦, 吴敏, 吴桂兴, et al. 基于最大熵模型的介词纠错系统[J]. 计算机系统应用, 2016, 25(1):96-100.
[13] 林燕. 基于n-gram的英语文章的自动检查[J]. 信息化建设, 2016(6).
[14] 谭咏梅, 杨一枭, 杨林, et al. 基于LSTM和N-gram的ESL文章的语法错误自动纠正方法[J]. 中文信息学报, 2018, v.32(06):24-32.
[15] Shei C C, Pain H. An ESL Writer’s Collocational Aid[J]. Computer Assisted Language Learning, 2000, 13(2):167-182.
[16] 项炜, 金澎. 大规模语料库上的Stanford和Berkeley句法分析器性能对比分析[J]. 电脑知识与技术, 2013(8).
[17] 杨国基, 梁洪峻. 自然语言处理中基于短语结构的语法分析方法[J]. 微处理机, 2009, 30(6):74-77.
[18] Marneffe M C D, Manning C D. The Stanford typed dependencies representation[C]// Coling: Workshop on Cross-framework & Cross-domain Parser Evaluation. 2008.
[19] Gospodnetic. LUCENE IN ACTION[J]. Action, 2010.
[20] NUS Natural Language Processing Group. Data of NUS Corpus of Learner English[DB/OL]. (2014) https://www.comp.nus.edu.sg/~nlp/corpora.html
[21] Ng H T, Wu Siewmei, Wu Yuanbin, et al. The Co NLL-2013 shared task on grammatical error correction. Proceedings of the Seventeenth Conference on Computational Natural Language Learning. August 8-9, 2013.1-12.
[22] 赵新城. 中国学习者英语作文中的词类失误现象分析——一项基于中国学习者英语语料库的实证调查[J]. 北京第二外国语学院学报, 2008, 31(8).
[23] 杨惠中. 基于CLEC语料库的中国学习者英语分析[M]. 上海外语教育出版社, 2005.
[24] Mi kowski M. Developing an open-source, rule-based proofreading tool[J]. Software-Practice and Experience, 2010, 40 (7) :543-566.
[25] 姜赢, 曾杰, 林启红, et al. LanguageTool中文语法校对XML规则定制方法[J]. 图书情报工作, 2014, 58(5):86-92.
[26] 王秀娟. 文本检索中若干问题研究[D]. 北京邮电大学, 2006.
[27] 文继军, 王珊. SEEKER:基于关键词的关系数据库信息检索[J]. 软件学报, 2005, 16(7):1270-1281.
[28] Oxford University press. The British National Corpus (BNC) [DB/OL]. (1980s-1990s) https://corpus.byu.edu/bnc/.
[29] 李小撒, 王文宇. WordNet与BNC介入下的第二语言心理词汇联系模式实证研究[J]. 语言科学, 2016, 15(1).
[30] 葛诗利. 面向大学英语教学的通用计算机作文评分和反馈方法研究[M]. 上海外语教育出版社, 2015.
[31] Islam A, Inkpen D, Islam A, et al. Real-word spelling correction using Google Web IT 3-grams[C]// International Conference on Natural Language Processing & Knowledge Engineering. IEEE, 2009.
[32] JonathanCrowther. 牛津英语搭配词典[M]. 外语教学与研究出版社, 2003.
[33] 叶莹, 徐海女. 英语学习型词典搭配信息表征的创新趋势研究——以《牛津高阶英语词典》(1-8版)为例[J]. 辞书研究, 2014(6):45-51.
[34] Mark Davies. Collocates data[DB/OL]. (2018). https://www.collocates.info/purchase_iweb.asp.
[35] Fellbaum C. WordNet[M]// The Encyclopedia of Applied Linguistics. 2012.
[36] 李少锋, Ellis R, 束定芳. 纠错反馈时机对不同二语水平学习者的教学效果研究(英文)[J]. 外语与外语教学, 2016(1):1-14.
[37] 麻秀丽. “错误提示”英语写作教学法研究[J]. 中国教育学刊, 2013(s2):57-58.
[38] 李奕华. 基于动态评估理论的英语写作反馈方式比较研究[J]. 外语界, 2015(3):59-67.
[39] Powell H. Teaching Language: From Grammar to Grammaring[J]. Tesol Quarterly, 2012, 38(1):172-173.
[40] 朱晔. 反馈信息与知识状态的互动与效果[J]. 现代外语, 2014(2).
Release date:

 2019-06-14    

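Research on Deep-Learning-Based Video Action Recognition. Chang Zhiyong (常志勇)

Link

Title:

 Research on Deep-Learning-Based Video Action Recognition

Author:

 Chang Zhiyong (常志勇)

Student ID:

 1601210438

Language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Availability:

 After 3 years

Degree level:

 Master's

Degree:

 Professional Master of Engineering

Institution:

 Peking University

School:

 School of Software and Microelectronics

Advisor name:

 Yu Jingsong (俞敬松)

Advisor affiliation:

 Institute of Computer Science and Technology

Defense date:

 2019-05-29

Keywords (Chinese):

 Action recognition; deep learning; global mechanism; local dense connection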

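Abstract:

In recent years Internet technology has matured, and the spread of smartphones and other digital devices has filled the web with video. As the volume of video grows rapidly, content containing violence and pornography is spread freely, which harms the physical and mental health of young people and puts great pressure on network regulation. With surveillance footage and online video both continuing to grow, so does the demand for understanding video content and analyzing human actions in video. Using computers makes it possible to understand video content better and spares people from spending large amounts of time analyzing video.

Deep learning has contributed greatly to computer vision. Training deep neural networks on large-scale datasets has produced good results in object detection, image classification, and human action recognition in video. Deep learning models image data abstractly and extracts image features automatically, and a video can be viewed as a stack of image frames. This thesis therefore explores deep learning methods for recognizing human actions in video.

The main work of this thesis is as follows:

Most convolutional networks used in existing two-stream action recognition methods are BN-Inception or VGG architectures, whose large parameter counts make training difficult. This thesis therefore uses the DenseNet architecture to extract the spatial and temporal information of a video separately. The original DenseNet uses global dense connectivity, in which every layer within a dense block is connected to all the other layers; this tends to produce redundant features and a large number of parameters. Moreover, because the input to each layer is the concatenation of the feature maps output by all preceding layers, these intermediate feature maps must be stored during both forward and backward propagation, so the original DenseNet uses a lot of memory during training. To address these problems, the thesis modifies DenseNet in two ways. First, it replaces the all-to-all connections between layers with local connections, so that each layer is connected only to some of the preceding layers, which greatly reduces the number of parameters to be trained; it also uses shared memory to reduce the model's memory footprint. Second, existing two-stream convolutional networks make the final action prediction by taking a weighted average of the two networks' outputs, which does not make full use of the spatio-temporal information in the video; the thesis instead merges the extracted video information along the spatial and temporal dimensions.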
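To see why local connections reduce the parameter count, the following back-of-the-envelope sketch compares the two wiring schemes. It rests on assumptions the abstract does not state: each layer is a single 3x3 convolution producing `growth` channels, "local" means seeing only the last `window` feature maps, and the block sizes are arbitrary. The function names are hypothetical.

```python
def dense_block_input_channels(num_layers, c0, growth, window=None):
    """Input channel count of each layer in a dense block.

    window=None models the original global dense connectivity (every layer
    sees the block input and all earlier outputs); an integer window models
    an assumed local variant that sees only the last `window` feature maps.
    """
    feature_channels = [c0]  # channels of the block input
    inputs = []
    for _ in range(num_layers):
        visible = feature_channels if window is None else feature_channels[-window:]
        inputs.append(sum(visible))
        feature_channels.append(growth)  # each layer emits `growth` channels
    return inputs

def conv3x3_weights(inputs, growth):
    """Total weights if every layer is one 3x3 convolution (biases ignored)."""
    return sum(c_in * growth * 3 * 3 for c_in in inputs)

if __name__ == "__main__":
    global_in = dense_block_input_channels(6, c0=64, growth=32)
    local_in = dense_block_input_channels(6, c0=64, growth=32, window=2)
    print(global_in, conv3x3_weights(global_in, 32))
    print(local_in, conv3x3_weights(local_in, 32))
```

Under these assumptions the global block's input width grows linearly with depth while the local block's stays bounded, which is the source of the savings in both weights and stored feature maps.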

Classification number:

 TP3

Total pages:

 53

Number of references:

 46

参考文献:
[1] heng wang, alexander kläser, cordelia schmid,等. dense trajectories and motion boundary deors for action recognition[j]. international journal of computer vision, 2013, 103(1):60-79.
[2] pun t, pun t. a new method for gray-level picture threshold using the entropy of the histogram[j]. signal processing, 1985, 29(3):223-237.
[3] dalal n, triggs b, schmid c. human detection using oriented histograms of flow and appearance[j]. 2006.
[4] wang h , klaser a , schmid c , et al. action recognition by dense trajectories[j]. proceedings / cvpr, ieee computer society conference on computer vision and pattern recognition. ieee computer society conference on computer vision and pattern recognition, 2011.
[5] shu z , yun k , samaras d . action detection with improved dense trajectories and sliding window[j]. 2014.
[6] mironici, du i c, ionescu b, et al. a modified vector of locally aggregated deors approach for fast video classification[j]. multimedia tools & applications, 2016, 75(15):9045-9072.
[7] tirilly p, claveau v, gros p. language modeling for bag-of-visual words image categorization[c]// 2008.
[8] mika s, ratsch g, weston j, et al. fisher discriminant analysis with kernels[c]// neural networks for signal processing ix, ieee signal processing society workshop. 2002.
[9] bicego, manuele, lagorio, et al. on the use of sift features for face authentication[c]// computer vision & pattern recognition workshop. 2006.
[10] suykens j a k . support vector machines: a nonlinear modelling and control perspective.[j]. european journal of control, 2001, 7(2-3):311-327.
[11] lv f, nevatia r. recognition and segmentation of 3-d human action using hmm and multi-class adaboost[m]// computer vision – eccv 2006. 2006.
[12] karpathy a , toderici g , shetty s , large-scale video classification with convolutional neural networks[c]// 2014 ieee conference on computer vision and pattern recognition (cvpr). ieee, 2014. [13] donahue j , hendricks l a , guadarrama s , et al. long-term recurrent convolutional networks for visual recognition and deion[m]// ab initto calculation of the structures and properties of molecules /. elsevier, 2015.
[14] simonyan k , zisserman a . two-stream convolutional networks for action recognition in videos[j]. 2014.
[15] carreira j, zisserman a. quo vadis, action recognition? a new model and the kinetics dataset[j]. 2018.
[16] wang l , xiong y , wang z , et al. temporal segment networks for action recognition in videos[j]. 2017.
[17] ioffe s , szegedy c . batch normalization: accelerating deep network training by reducing internal covariate shift[c]// international conference on international conference on machine learning. jmlr.org, 2015.
[18]soomro k , zamir a r , shah m . ucf101: a dataset of 101 human actions classes from videos in the wild[j]. computer science, 2012.
[19] kuehne h , jhuang h , garrote e , et al. [ieee 2011 ieee international conference on computer vision (iccv) - barcelona, spain (2011.11.6-2011.11.13)] 2011 international conference on computer vision - hmdb: a large video database for human motion recognition[c]// ieee international conference on computer vision. dblp, 2011:2556-2563.
[20] zhu j , arbor a , hastie t . multi-class adaboost[j]. statistics & its interface, 2006, 2(3):349-360. [21] deng j , dong w , socher r , et al. imagenet: a large-scale hierarchical image database[c]// 2009 ieee conference on computer vision and pattern recognition. ieee, 2009.
[22] hinton g e . rectified linear units improve restricted boltzmann machines vinod nair[c]// international conference on international conference on machine learning. omnipress, 2010.
[23] clevert, djork-arné, unterthiner t , hochreiter s . fast and accurate deep network learning by exponential linear units (elus)[j]. computer science, 2015.
[24] he k , zhang x , ren s , et al. deep residual learning for image recognition[j]. 2015.
[25] x. glorot and y. bengio. understanding the difficulty of training deep feedforward neural networks. in aistats, 2010.
[26] hochreiter s, schmidhuber j. long short-term memory[j]. neural computation, 1997, 9(8):1735-1780.
[27] a. krizhevsky, i. sutskever, and g. hinton. imagenet classification with deep convolutional neural networks. in nips, 2012.
[28] m.lin,q.chen,ands.yan.network in network.arxiv:1312.4400, 2013.
[29] szegedy c , liu w , jia y , et al. going deeper with convolutions[c]// 2015 ieee conference on computer vision and pattern recognition (cvpr). ieee, 2015.
[30] k. simonyan and a. zisserman. very deep convolutional networks for large-scale image recognition. in iclr, 2015.
[31] huang g, liu z, laurens v d m, et al. densely connected convolutional networks[j]. 2016.
[32] szegedy c, ioffe s, vanhoucke v, et al. inception-v4, inception-resnet and the impact of residual connections on learning[j]. 2016.
[33] szegedy c , vanhoucke v , ioffe s , et al. [ieee 2016 ieee conference on computer vision and pattern recognition (cvpr) - las vegas, nv, usa (2016.6.27-2016.6.30)] 2016 ieee conference on computer vision and pattern recognition (cvpr) - rethinking the inception architecture for computer vision[j]. 2016:2818-2826.
[34] ji s , xu w , yang m , et al. 3d convolutional neural networks for human action recognition[j]. ieee transactions on pattern analysis and machine intelligence, 2013, 35(1):221-231.
[35] zeiler m d , fergus r . stochastic pooling for regularization of deep convolutional neural networks[j]. eprint arxiv, 2013.
[36] yue k , xu f , yu j . shallow and wide fractional max-pooling network for image classification[j]. neural computing and applications, 2017.
[37] srivastava n, hinton g, krizhevsky a, et al. dropout: a simple way to prevent neural networks from overfitting[j]. journal of machine learning research, 2014, 15(1):1929-1958.
[38] brox t, bruhn a, papenberg n, et al. high accuracy optical flow estimation based on a theory for warping[m]// computer vision - eccv 2004. 2004.
[39] schuster m , paliwal k k . bidirectional recurrent neural networks[j]. ieee transactions on signal processing, 1997, 45(11):2673-2681.
[40] Lécun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11):2278-2324.
[41] Vinyals O, Toshev A, Bengio S, et al. Show and tell: A neural image caption generator[C]// IEEE Conference on Computer Vision & Pattern Recognition. 2015.
[42] Bahdanau D, Chorowski J, Serdyuk D, et al. End-to-End Attention-based Large Vocabulary Speech Recognition[J]. Computer Science, 2015:4945-4949.
[43] Guo H, Wu X, Wei F. Multi-stream Deep Networks for Human Action Classification with Sequential Tensor Decomposition[J]. Signal Processing, 2017:S0165168417301937.
[44] Srivastava R K, Greff K, Schmidhuber J. Highway Networks[J]. Computer Science, 2015.
[45] Sermanet P, Eigen D, Zhang X, et al. OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks[J]. Eprint Arxiv, 2013.
[46]Zhou B , Khosla A , Lapedriza A , et al. Learning Deep Features for Discriminative Localization[J]. 2015.







Release date:

 2022-06-11    

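Design and Implementation of a Corpus Query System for Writing Assistance. Hu Gailei (胡盖蕾)

Link

Title:

 Design and Implementation of a Corpus Query System for Writing Assistance

Name:

 Hu Gailei (胡盖蕾)

Student ID:

 1601210568

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Availability:

 Public

Degree level:

 Master's

Degree:

 Professional Master of Engineering

Institution:

 Peking University

School:

 School of Software and Microelectronics

Advisor 1 name:

 Yu Jingsong (俞敬松)

Advisor 1 affiliation:

 School of Software and Microelectronics

Defense date:

 2019-05-29

Foreign-language title:

 Design and Implementation of a Corpus Query System for Writing Assistance

Keywords:

 Corpus; reading assistance; writing assistance; corpus query; system design and implementation

Foreign-language keywords:

 Corpus; Assisting Reading; Assisting Writing; Corpus Query; System Design and Implementation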

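Abstract:

English writing is a weak point in the English ability of students in China. The public English writing courses offered by schools have relatively limited effect. Although there are websites and books for learning English writing, most simply collect and display resources such as template sentence patterns, and the guidance they offer learners on the spot is not targeted. English writing aids mainly provide machine review and automatic scoring; they can only help users find common errors in an essay and cannot offer guidance on improving content that is already correct.

For writing learners, consulting existing professional or well-written text is an effective way to learn, and a corpus, as a knowledge base of authentic language, can provide reliable guidance in learning and teaching writing. However, most corpora today are designed for academic research and are inconvenient for ordinary teachers and students: (1) they have many functions and complex query parameters, so the cost of learning to use them is high; (2) the first few search results are often long, difficult sentences, which makes for a poor reading experience; (3) query results inevitably contain complex vocabulary and sentences, which places a reading burden on ordinary learners.

Starting from the goal of assisting English writing, this study designs and implements a corpus query system for ordinary students and teachers, so that users can obtain corpus information conveniently and effectively and use corpus queries to help find and correct errors and weaknesses in their essays.

The thesis first addresses the usability problems of current corpus query systems for ordinary English learners, namely inconvenient functions and difficulty reading corpus text. For these problems the system implements a basic corpus retrieval module, a result reconstruction module, and a sentence reading-assistance module. The basic retrieval module provides common corpus query functions and a simple query mode, making the corpus more convenient to use. The result reconstruction module aims to improve the reading experience: it sorts example sentences from easy to hard and specially marks words in the results that are unfamiliar to the user. The sentence reading-assistance module aims to help users absorb the corpus material by providing a machine translation of the sentence, its syntactic breakdown, simplified sentences, and similar information.

The thesis then studies the use of corpus queries in writing. Focusing on specific problems common in Chinese students' English writing, such as rigid collocation use and overly colloquial collocations, it implements a corpus-query-assisted writing/grading module. The module provides three functions: collocation extraction from essays, collocation richness analysis, and collocation verification, using corpus data to help users review and improve the collocations in their essays.

In testing, result reconstruction effectively increased users' interest in reading corpus text; the sentence reading-assistance module's help in understanding long, difficult sentences was unanimously recognized by the test participants; and corpus-assisted writing measurably reduced collocation errors and repeated collocations in essays. After 10 essays were revised, their scores on Pigai (批改网) rose by 1.9 points on average (out of 100), with a maximum gain of 5 points.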

English abstract:

English writing is a weak point for Chinese students. The English writing classes offered by domestic schools have relatively limited effect. Although there are websites and books that aim to help students write in English, most of them simply display resources such as sentence patterns and do not give users targeted instruction. Writing aids mostly cannot help learners reach a higher level of English writing: they review essays, pick out common mistakes, and assign grades, but offer no guidance on sentences that are already correct.

Referring to idiomatic writing is an effective way for learners to improve their writing. A corpus, as a repository of authentic language, can provide reliable guidance for learning and teaching writing. However, most corpora today are designed for academic purposes and do not fit the needs of ordinary teachers and students, who may find them inconvenient because: 1) they have numerous functions and complicated query parameter settings, so they are hard to learn; 2) the first few search results are often long and difficult, which makes them hard to read; 3) complex words and sentences in the query results can be a burden for average learners.

Motivated by the idea of assisting English writing, this study designs and implements a corpus query system for ordinary students and teachers. Using the system, users can easily obtain corpus information and use corpus queries to help identify and correct weaknesses in their essays.

This thesis first tackles the usability problems, namely the inconvenience of using the corpus and of reading corpus data. To overcome these obstacles, the system comprises a basic query module, a query result reconstruction module, and a sentence reading assistance module. The basic query module includes a simple query mode that makes the corpus more convenient to use. The result reconstruction module aims to improve the reading experience: it sorts the example sentences in order of difficulty and highlights unfamiliar words. The sentence reading assistance module provides syntactic splitting results (including clause recognition and collocation), machine translation results, simplified sentences, and related information.

This study also explores the application of corpus queries in English writing. To address the lexical fossilization and colloquialism common among Chinese students, the author develops a corpus-assisted writing/examining module that helps them write by querying the corpus. The module extracts collocations, analyzes their richness, and verifies them, helping users examine and improve their use of collocations.

Tests show that result reconstruction effectively boosts users' interest in reading corpus query results. The sentence reading assistance module was well received by the subjects when tested on understanding long, difficult sentences. The assisted writing module improves the use of collocations in essays: the scores of 10 revised essays graded on Pigaiwang (a website that provides automated grading) rose by 1.9 points on average (out of 100) after revision with the system, with a maximum gain of 5 points.

Classification number:

 TP3

Total pages:

 106

Total references:

 86

Publication date:

 2019-06-12    

基于文献的中医经方靶点预测关键技术研究.张琢

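Link

Title:

 Research on Key Technologies for Literature-Based Target Prediction of Classical TCM Formulas (基于文献的中医经方靶点预测关键技术研究)

Name:

 张琢

Student ID:

 1601210876

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Public availability:

 Public

Degree level:

 Master's

Degree:

 Professional Master of Engineering

Degree-granting institution:

 Peking University

School/Department:

 School of Software and Microelectronics

Advisor 1 name:

 刘耀

Advisor 1 affiliation:

 Institute of Scientific and Technical Information of China (中国科学技术信息研究所)

Thesis defense date:

 2019-05-29

Keywords:

 literature-based; classical TCM formulas; constituent extraction; target prediction

Abstract: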

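Jingfang (经方) is short for the classical formulas of traditional Chinese medicine (TCM), and a target is the binding site in the human body at which a drug acts when it exerts its clinical effect. Across the Chinese and international literature, few studies investigate targets at the level of a classical formula. Classical TCM formulas have limited influence internationally, and foreign scholars, constrained by language, have produced very few target studies that take the formula as the unit. Chinese scholars have a natural language advantage, but because of differences in diagnostic and treatment philosophy, they too have published few such studies. Moreover, studying targets at the formula level requires manually searching the literature, determining the formula's recipe from ancient texts and its constituents from books and journal papers, handling synonyms properly, translating the constituents into appropriate English, and retrieving the corresponding targets from a target database, all before experimental work can begin. Some formulas have several versions of their recipe; each recipe contains several herbs; each herb has many constituents; each constituent corresponds to different targets; and these terms are expressed inconsistently across the literature. Researchers therefore have to do a large amount of literature review before carrying out any actual medical experiment. This thesis aims to discover the targets of classical formulas from the literature, giving researchers a reference and reducing the burden of literature review.

The simplest way to discover a formula's targets from the literature is to find papers that study the formula's targets directly and extract the information. Because such papers are scarce, this thesis builds several paths from formula to target to establish the link between them. A formula consists of herbs, herbs consist of constituents, and constituents can be linked directly to targets in a target database. The thesis therefore first extracts formula names, formula recipes, and formula constituents, then extracts targets, then builds a target screening model with the help of the target database, and finally aggregates targets from the different sources, computes a confidence for each target, and outputs a list of predicted targets for the formula.

The thesis decomposes the workflow of formula target prediction into four modules (recipe extraction, constituent extraction, target acquisition, and target prediction) and studies the key technologies around these four modules.

For recipe extraction, the thesis parses several classic books to extract formula recipes, compares the results automatically, and outputs a relatively consistent recipe, which solves the problem of extracting key information from multi-version literature resources. This part originates in the application but is not confined to it: the simple workflow serves to verify technical feasibility and offers a solution for similar problems.

For constituent extraction, the thesis classifies journal quality by impact factor and by whether the journal is a core journal. Relying mainly on classic books and high-quality journal papers, with general literature as a supplement, it extracts constituent terms by combining rules and statistics. In the extraction model, a general-purpose word segmentation tool first segments the text and key sentences are selected; the key sentences are then re-segmented with backward maximum matching (最大逆匹配), which ensures that constituent terms containing special symbols can be segmented and extracted. The thesis also introduces a Bi-gram model augmented with rules, and discovers new constituent terms by computing word frequency, mutual information, and information entropy, which reduces the segmentation tool's dependence on its initial dictionary.

For target acquisition, the thesis proposes two approaches. The first extracts targets directly from the literature with regular expressions: first the targets of the formula itself, then those of the herbs in its recipe, then those of its constituents. The extraction method is essentially the same in each case, but the relevance of the three to the formula's targets decreases in that order, so the corresponding target confidence decreases accordingly. The second approach first obtains the targets of the formula's constituents from a target database, and then screens the targets by classifying the literature in which a constituent and a target co-occur. The thesis first obtains English translations of the constituents from classic books and then retrieves the targets of each constituent from the Drugbank target database. We assume that if the relation between a constituent and a target is credible, the literature in which the two co-occur must contain suitable supporting references. In this way the target screening problem is converted into the problem of classifying constituent-target co-occurrence literature. We therefore collected the literature in which formulas and targets co-occur, annotated it manually, and treated documents that can support a target as target evidence documents; deciding whether a document is a target evidence document is treated as a binary classification problem, and whether a target has evidence documents serves as the preliminary basis for target screening. We selected several classic text features and ran comparative feature optimization experiments with naive Bayes, KNN, and SVM classifiers, finally settling on a combination of information gain and the chi-square test as features and an SVM model for classifying target evidence documents, which yields the preliminary target screening.

Finally, the thesis builds an integrated formula target prediction model and proposes a target confidence scoring model, which scores each target by its source and by the number and relevance of its evidence documents. Comprehensive experiments were run with the prediction model: a threshold optimization experiment using Shaoyao Gancao Tang (芍药甘草汤), followed by validation of the model on Dahuang Huanglian Xiexin Tang (大黄黄连泻心汤) and Sini Tang (四逆汤).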
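
Backward (reverse) maximum matching, the re-segmentation step named in this paragraph, can be sketched in a few lines. This is a generic textbook version, not the thesis's implementation; the function name `backward_max_match` and the tiny dictionary are hypothetical and chosen only to show how a term containing a special symbol survives segmentation.

```python
def backward_max_match(text, dictionary, max_len=None):
    """Segment `text` by scanning from the end and taking the longest dictionary match."""
    if max_len is None:
        max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end` first, then shrink it.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                end -= size
                break
    tokens.reverse()
    return tokens

# Hypothetical dictionary of constituent terms, including one with special symbols.
terms = {"含有", "芍药苷", "甘草酸", "β-谷甾醇"}
tokens = backward_max_match("含有芍药苷和β-谷甾醇", terms)
```

Because the dictionary entry `β-谷甾醇` is matched as a whole, the hyphen and Greek letter do not split the term, which is the property the paragraph attributes to this step.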
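
The chi-square test mentioned as a feature selection criterion scores how strongly a term is associated with the "target evidence" class. The sketch below uses the standard 2x2 contingency formula; the toy documents, labels, and function names are made up for illustration and are not data or code from the thesis.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/class table.

    a: in-class documents containing the term
    b: out-of-class documents containing the term
    c: in-class documents without the term
    d: out-of-class documents without the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom

def score_terms(docs, labels):
    """Score every term in tokenized `docs` against boolean `labels`."""
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    scores = {}
    for term in {t for doc in docs for t in doc}:
        a = sum(1 for doc, y in zip(docs, labels) if y and term in doc)
        b = sum(1 for doc, y in zip(docs, labels) if not y and term in doc)
        scores[term] = chi_square(a, b, positives - a, negatives - b)
    return scores

# Toy corpus: True marks a (hypothetical) target evidence document.
docs = [["binds", "receptor"], ["inhibits", "receptor"],
        ["price", "market"], ["market", "receptor"]]
labels = [True, True, False, False]
scores = score_terms(docs, labels)
```

Terms with the highest scores would be kept as features for the downstream classifier; how the thesis combines this score with information gain is not stated in the abstract.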

Classification number:

 TP3

Total pages:

 71

Total references:

 80

Publication date:

 2019-06-13    

基于网络表示学习的科技简报自动生成关键技术研究.张越

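Link

Title:

 Research on Key Technologies for Automatic Generation of Scientific Briefings Based on Network Representation Learning (基于网络表示学习的科技简报自动生成关键技术研究)

Name:

 张越

Student ID:

 1601210872

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Public availability:

 Public

Degree level:

 Master's

Degree:

 Professional Master of Engineering

Degree-granting institution:

 Peking University

School/Department:

 School of Software and Microelectronics

Advisor 1 name:

 刘耀

Advisor 1 affiliation:

 School of Software and Microelectronics

Thesis defense date:

 2019-05-29

Keywords:

 scientific briefing; concept and relation indexing; knowledge network; network representation learning; scientific briefing generation

Abstract: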

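The scientific briefing is an important genre among official documents for scientific and technological intelligence, and provides a reference for decision-making bodies at all levels when formulating science and technology policy. With the arrival of the "big data era", however, the content, organization, and application of traditional scientific and technological intelligence have changed in various ways, and extracting important information from massive sources of scientific report data is one of the major challenges facing researchers today. The core technology of existing automatic briefing generation systems is text generation, which is not tailored to the textual characteristics of scientific briefings and therefore makes poor use of scientific report resources. In addition, existing text generation techniques cannot effectively reflect the rich knowledge network information in science and technology policy corpora. Against this background, the thesis analyzes and experiments with earlier research results. It mainly compares the scientific briefings published over the years by the Science and Technology Research Institute (科技研究所) with their original policy texts, and uses this retrospective approach to study how to generate a short scientific briefing from a lengthy policy text. The key technologies involved include concept and relation indexing, knowledge network construction, network representation learning, and text generation; the thesis finally applies them to implement automatic generation of scientific briefings.

The thesis first analyzes the textual features of science and technology policy texts, and automatically indexes the concepts and the relations between concepts in these texts according to classification schemes for concepts and for inter-concept relations. Concept indexing uses a deep learning method based on RNN+CRF to identify concept terms in a sentence automatically and attach category labels. Relation indexing analyzes the relational features between concepts and uses an SVM-based active learning classification method to label relations for pairs of concept entities automatically. Experiments show that the methods used can effectively index concept terms in policy texts and predict the relations between them.

Having indexed the policy texts automatically, the thesis further studies how to build a knowledge network from concepts and their relations, and proposes construction methods for a concept knowledge network and for a knowledge network that incorporates document (chapter) structure. On top of this knowledge network model, it adopts a network representation learning model that fuses node semantics, topology, and category label information, and improves the model by introducing the Node2vec algorithm and knowledge inference information. It also analyzes the representation of chapter nodes in the knowledge network with document structure. Finally, an SVM classifier and visualization are used to show that the proposed network representation learning method represents the nodes of the knowledge network more effectively.

Finally, the thesis applies the knowledge network node representations to the automatic generation of both the structure and the content of scientific briefings, based on a single document and on multiple documents. For the briefing's structure, it either retains the structure of the source text or selects important chapter nodes. For content generation, it studies both extractive and abstractive approaches. Lastly, taking the writing characteristics of scientific briefings into account, the thesis completes the automatic briefing generation function.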
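
The Node2vec algorithm named above learns node vectors from biased second-order random walks. The sketch below shows only the walk sampling stage on an unweighted graph, using the usual return parameter `p` and in-out parameter `q`; it is not the thesis's model, and the toy graph and the name `node2vec_walk` are assumptions made for illustration.

```python
import random

def node2vec_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Sample one biased walk; `graph` maps each node to a set of neighbors."""
    if rng is None:
        rng = random.Random()
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = sorted(graph[current])
        if not neighbors:
            break
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)   # step back to the previous node
            elif candidate in graph[previous]:
                weights.append(1.0)       # stay close to the previous node
            else:
                weights.append(1.0 / q)   # move further away
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk

# Toy concept graph with hypothetical node names.
graph = {"a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b"}, "d": {"b"}}
walk = node2vec_walk(graph, "a", 10, p=0.5, q=2.0, rng=random.Random(0))
```

In the full algorithm many such walks per node are fed to a skip-gram model to obtain embeddings; a small `q` favors outward exploration, while a small `p` keeps walks local.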

English abstract:

The scientific briefing is an important kind of document in scientific and technological information, and provides a reference for decision makers at all levels when formulating science and technology policies. However, with the advent of the big data era, the content, organizational model, and application of traditional scientific and technological information have changed in various ways. How to extract valuable information from the vast amount of scientific and technological report data is a challenge that researchers currently face. At present, the core of automatic briefing generation systems is text generation, which ignores the textual characteristics of the scientific briefing, so the utilization of scientific report resources is low. Also, existing text generation technology cannot adequately reflect the rich knowledge network information in the science and technology policy corpus. Based on this situation, this paper analyzes and experiments with the research results of its predecessors, mainly comparing the scientific briefings published by the Science and Technology Research Institute over the years with the original science and technology policy texts, and studying how to generate short briefings from the lengthy policy texts. The key techniques include concept and relation labeling, knowledge network construction, network representation learning, and text generation. At the end of the paper, these technologies are used to implement the automatic generation of scientific briefings.

This paper first analyzes the textual characteristics of science and technology policy texts and automatically labels the concepts and conceptual relationships in them according to classification schemes for concepts and for the relationships among concepts. Concept labeling adopts a deep learning method based on RNN+CRF, which automatically recognizes concepts in a sentence and adds category labels. Relationship labeling mainly analyzes the relationship characteristics between concepts and adopts an SVM active learning classification method to label the relationships between unmarked concept entity pairs. Experiments show that the techniques used in this paper can effectively recognize the concept words in science and technology policy texts and predict the relationship labels between concept words.

After realizing the automatic labeling of science and technology policy texts, this paper further studies how to construct a knowledge network from concepts and the relationships between them. Based on this knowledge network model, the paper adopts a basic network representation learning model that can fuse node semantics, topological structure, and category label information, and introduces the Node2vec algorithm and knowledge representation learning algorithms to improve the basic model. The paper also analyzes the node representation in the knowledge network with chapter structure. Finally, an SVM classifier and a visualization method show that the network representation learning method proposed in this paper represents the nodes in the knowledge network more effectively.

Finally, this paper applies the knowledge network node representation to the automatic generation of the chapter structure and content of single-article and multi-article scientific briefings. For the generation of the briefing's structure, this paper either retains the original text structure or selects important chapter nodes. For text content generation, this paper studies both extractive and abstractive generation. Lastly, this paper focuses on the writing characteristics of scientific briefings and completes the automatic generation of the scientific briefing.

Classification number:

 TP3

Total pages:

 80

Total references:

 55

参考文献列表:
[1] 李念峰. 基于自动摘要的网络情报收集系统研究[J]. 现代情报, 2007, 27(11):161-163.
[2] 尹显贵. 基于Web的企业竞争情报服务平台中多文本摘要技术研究[D]. 昆明理工大学, 2012.
[3] 孟凡坤. 特定领域知识库的构建与简报生成[D]. 北京工业大学, 2014.
[4] 张晓艳, 王挺, 陈火旺. 命名实体识别研究[J]. 计算机科学, 2005, 32(4):44-48.
[5] Collins M, Singer Y. Unsupervised models for named entity classification[C]//1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora. 1999.
[6] Bikel D M, Schwartz R, Weischedel R M. An algorithm that learns what's in a name[J]. Machine learning, 1999, 34(1-3): 211-231.
[7] Curran J, Clark S. Language independent NER using a maximum entropy tagger[C]//Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003. 2003.
[8] McNamee P, Mayfield J. Entity extraction without language-specific resources[C]//proceedings of the 6th conference on Natural language learning-Volume 20. Association for Computational Linguistics, 2002: 1-4.
[9] McCallum A, Li W. Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons[C]//Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003-Volume 4. Association for Computational Linguistics, 2003: 188-191.
[10] Collobert R, Weston J, Bottou L, et al. Natural language processing (almost) from scratch[J]. Journal of machine learning research, 2011, 12(Aug): 2493-2537.
[11] Huang Z, Xu W, Yu K. Bidirectional LSTM-CRF models for sequence tagging[J]. arXiv preprint arXiv:1508.01991, 2015.
[12] Pham T H, Le-Hong P. End-to-end recurrent neural network models for vietnamese named entity recognition: Word-level vs. character-level[C]//International Conference of the Pacific Association for Computational Linguistics. Springer, Singapore, 2017: 219-232.
[13] Ma X, Hovy E. End-to-end sequence labeling via bi-directional lstm-cnns-crf[J]. arXiv preprint arXiv:1603.01354, 2016.
[14] Miller S, Fox H, Ramshaw L, et al. A novel use of statistical parsing to extract information from text[C]//1st Meeting of the North American Chapter of the Association for Computational Linguistics. 2000.
[15] Mintz M, Bills S, Snow R, et al. Distant supervision for relation extraction without labeled data[C]//Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2-Volume 2. Association for Computational Linguistics, 2009: 1003-1011.
[16] Zelenko D, Aone C, Richardella A. Kernel methods for relation extraction[J]. Journal of machine learning research, 2003, 3(Feb): 1083-1106.
[17] Brin S. Extracting patterns and relations from the world wide web[C]//International workshop on the world wide web and databases. Springer, Berlin, Heidelberg, 1998: 172-183.
[18] Hasegawa T, Sekine S, Grishman R. Discovering relations among named entities from large corpora[C]//Proceedings of the 42nd annual meeting on association for computational linguistics. Association for Computational Linguistics, 2004: 415.
[19] Piasecki M, Ramocki R, Kaliński M. Information spreading in expanding wordnet hypernymy structure[C]//Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013. 2013: 553-561.
[20] Gonzalez J E, Xin R S, Dave A, et al. Graphx: Graph processing in a distributed dataflow framework[C]//11th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 14). 2014: 599-613.
[21] Low Y, Bickson D, Gonzalez J, et al. Distributed GraphLab: a framework for machine learning and data mining in the cloud[J]. Proceedings of the VLDB Endowment, 2012, 5(8): 716-727.
[22] Perozzi B, Al-Rfou R, Skiena S. Deepwalk: Online learning of social representations[C]//Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2014: 701-710.
[23] Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality[C]//Advances in neural information processing systems. 2013: 3111-3119.
[24] 涂存超, 杨成, 刘知远,等. 网络表示学习综述[J]. 中国科学:信息科学, 2017(8):32-48.
[25] Tang J, Qu M, Wang M, et al. Line: Large-scale information network embedding[C]//Proceedings of the 24th international conference on world wide web. International World Wide Web Conferences Steering Committee, 2015: 1067-1077.

[26] Grover A, Leskovec J. node2vec: Scalable feature learning for networks[C]//Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2016: 855-864.

[27] Wang D, Cui P, Zhu W. Structural deep network embedding[C]//Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2016: 1225-1234.
[28] Yang C, Liu Z, Zhao D, et al. Network representation learning with rich text information[C]//Twenty-Fourth International Joint Conference on Artificial Intelligence. 2015.

[29] Tu C, Liu H, Liu Z, et al. Cane: Context-aware network embedding for relation modeling[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017: 1722-1731.

[30] Tu C, Zhang Z, Liu Z, et al. TransNet: Translation-Based Network Representation Learning for Social Relation Extraction[C]//IJCAI. 2017: 2864-2870.
[31] Bordes A, Usunier N, Garcia-Duran A, et al. Translating embeddings for modeling multi-relational data[C]//Advances in neural information processing systems. 2013: 2787-2795.
[32] 宗成庆. 统计自然语言处理(第2版)[M]// 统计自然语言处理. 2008.
[33] Carbonell J G, Goldstein J. The Use of MMR and Diversity-Based Reranking for Reodering Documents and Producing Summaries[J]. 1998.
[34] Bollegala D, Okazaki N, Ishizuka M. A bottom-up approach to sentence ordering for multi-document summarization[J]. Information processing & management, 2010, 46(1): 89-109.
[35] Li C, Qian X, Liu Y. Using supervised bigram-based ILP for extractive summarization[C]//Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2013, 1: 1004-1013.
[36] Lin H, Bilmes J. A class of submodular functions for document summarization[C]//Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1. Association for Computational Linguistics, 2011: 510-520.
[37] Li C, Liu Y, Liu F, et al. Improving multi-documents summarization by sentence compression based on expanded constituent parse trees[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014: 691-701.
[38] Bing L, Li P, Liao Y, et al. Abstractive multi-document summarization via phrase selection and merging[J]. arXiv preprint arXiv:1506.01597, 2015.
[39] Liu F, Flanigan J, Thomson S, et al. Toward abstractive summarization using semantic representations[J]. arXiv preprint arXiv:1805.10399, 2018.
[40] Sutskever I, Vinyals O, Le Q V. Sequence to sequence learning with neural networks[C]//Advances in neural information processing systems. 2014: 3104-3112.
[41] Cho K, Van Merriënboer B, Gulcehre C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[J]. arXiv preprint arXiv:1406.1078, 2014.
[42] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[J]. arXiv preprint arXiv:1409.0473, 2014.
[43] Gehring J, Auli M, Grangier D, et al. A convolutional encoder model for neural machine translation[J]. arXiv preprint arXiv:1611.02344, 2016.
[44] Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[C]//Advances in neural information processing systems. 2014: 2672-2680.
[45] Le Q, Mikolov T. Distributed representations of sentences and documents[C]//International conference on machine learning. 2014: 1188-1196.
[46] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[J]. arXiv preprint arXiv:1409.0473, 2014.
[47] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural computation, 1997, 9(8): 1735-1780.
[48] 刘丹丹, 彭成, 钱龙华, et al. 词汇语义信息对中文实体关系抽取影响的比较[J]. 计算机应用, 2012, 32(08):2238-2244.
[49] 刘向, 马费成, 陈潇俊, et al. 知识网络的结构与演化——概念与理论进展[J]. 情报科学, 2011(6):801-809.
[50] Tu C, Wang H, Zeng X, et al. Community-enhanced network representation learning for network analysis[J]. arXiv preprint arXiv:1611.06645, 2016.
[51] Griffiths T L, Steyvers M. Finding scientific topics[J]. Proceedings of the National academy of Sciences, 2004, 101(suppl 1): 5228-5235.
[52] Pan S, Wu J, Zhu X, et al. Tri-party deep network representation[J]. Network, 2016, 11(9): 12.
[53] Bordes A, Usunier N, Garcia-Duran A, et al. Translating embeddings for modeling multi-relational data[C]//Advances in neural information processing systems. 2013: 2787-2795.
[54] 李娜娜, 刘培玉, 刘文锋, et al. 基于TextRank的自动摘要优化算法[J]. 计算机应用研究, 2019(5).
[55] Pennington J, Socher R, Manning C. Glove: Global vectors for word representation[C]//Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 2014: 1532-1543.
Publication date:

 2019-06-04    

基于文本分析与计算的科技政策扩散关键技术研究.张丽颖

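Link

Title:

 基于文本分析与计算的科技政策扩散关键技术研究 (Research on Key Technologies of Science and Technology Policy Diffusion Based on Text Analysis and Computation)

Author:

 张丽颖

Student ID:

 1601210855

Language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Public availability:

 After 3 years

Level of study:

 Master's

Degree:

 Professional Master of Engineering

Degree-granting institution:

 Peking University

Department:

 School of Software and Microelectronics

Supervisor:

 刘耀

Supervisor's affiliation:

 School of Software and Microelectronics

Defense date:

 2019-05-29

Keywords (Chinese):

 政策扩散 科技政策 文本挖掘 文本分析 (policy diffusion; science and technology policy; text mining; text analysis)

Abstract: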

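Since the reform and opening-up, China has issued a steady stream of new science and technology policies. These policies diffuse across levels of government and across regions, and they strongly influence both the policy behavior of governments at every level and improvements in science and technology governance, yet few scholars have studied this diffusion in depth. Existing research on science and technology policy diffusion is mostly qualitative: it concentrates on theoretical frameworks, constraining factors, and case studies, and lacks systematic validation of the proposed theories and models. The few quantitative studies mainly apply basic statistical analysis with heavy manual involvement and no automatic mining of policy content and attributes, which makes it hard to extract diffusion relations precisely or to track changes in content.

Against this background, and building on a review of prior work, this thesis constructs a diffusion model for science and technology policy at the structural and semantic levels, tailored to the characteristics of such diffusion. It introduces text analysis and computation methods from natural language processing to extract diffusion features and mine policy diffusion relations automatically.

(1) Meaningful-string discovery and policy structure extraction in the policy domain. Science and technology policies contain many new and long terms, so conventional word segmentation falls short of what the analysis requires. The thesis therefore proposes an optimized method based on rules and information entropy; experiments show that it correctly delimits the vast majority of meaningful strings in science and technology policy texts. For policy structure, the thesis proposes separate methods for extracting and for expanding the organizational structure. It first extracts a policy's organizational structure by exploiting the conventions of policy drafting together with term frequency and the TextRank algorithm. On that basis it builds a structural vocabulary for the science and technology policy domain and uses it to expand the organizational structure, finally extracting the policy's basic facets.

(2) Representation of policy diffusion features and determination of diffusion relations. The thesis studies diffusion features from both structural and semantic perspectives, extracting an organizational-structure correlation feature, a basic-facet identity feature, a feature-word inheritance feature, and a Doc2vec-based text similarity feature. Using these features, it adopts a decision tree classifier and recasts relation determination as a classification problem so that multiple features are handled jointly. Experiments show that the resulting multi-feature classification model determines policy diffusion relations effectively.

(3) Policy diffusion identification. To analyze the diffusion of science and technology policies within a single topic, the thesis builds a diffusion identification framework and introduces the Ranking SVM model, adapting it by fusing diffusion features with text diversification features. It then proposes a ranking-distance measure based on ranking scores and searches for the maximum ranking distance at which a diffusion relation still holds, taking that value as the empirical threshold for diffusion identification. The threshold is used to tune the identification model so that diffusion pairs and diffusion sets are computed and returned automatically during retrieval. Experiments show that the framework extracts diffusion sets effectively and meets users' needs for mining diffusion relations among science and technology policies on a given topic.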
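The entropy part of step (1) can be illustrated with branching entropy, a common way to find strings that segmenters miss: a candidate is likely a complete unit if many different characters appear on both its left and its right. The abstract does not give the thesis's rules or thresholds, so the function names, `min_count`, `min_entropy`, and `max_len` below are all assumptions for illustration, not the author's method.

```python
import math
from collections import Counter, defaultdict


def branching_entropy(text, max_len=4):
    """Map each substring (up to max_len chars) to (left_H, right_H, count)."""
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    count = Counter()
    n = len(text)
    for i in range(n):
        for k in range(1, max_len + 1):
            if i + k > n:
                break
            s = text[i:i + k]
            count[s] += 1
            # Neighbouring characters; text boundaries use sentinel symbols.
            left[s][text[i - 1] if i > 0 else "^"] += 1
            right[s][text[i + k] if i + k < n else "$"] += 1

    def entropy(counter):
        total = sum(counter.values())
        return -sum(c / total * math.log2(c / total) for c in counter.values())

    return {s: (entropy(left[s]), entropy(right[s]), count[s]) for s in count}


def meaningful_strings(text, min_count=2, min_entropy=1.0, max_len=4):
    """Keep multi-character strings that are frequent and free on both sides.

    The thresholds are hypothetical defaults, not values from the thesis.
    """
    stats = branching_entropy(text, max_len)
    return sorted(
        s for s, (left_h, right_h, c) in stats.items()
        if len(s) >= 2 and c >= min_count and min(left_h, right_h) >= min_entropy
    )
```

In practice the candidates would first be filtered by rules (for example, stop characters at the boundaries), and the thresholds would be tuned on annotated policy text.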

Classification number:

 TP3

Total pages:

 63

Number of references:

 79

References:
[1] 张玉娟, 杨海丽, 孟潇,等. 基于计量可视化分析的科技政策研究现状[C]//北京科学技术情报学会年会--“科技情报发展助力科技创新中心建设”论坛. 2017.
[2] Lazer D, Pentland A, Adamic L, et al. Computational Social Science[J]. Science, 2009, 323(1):721-723.
[3] 裴雷, 孙建军, 周兆韬. 政策文本计算:一种新的政策文本解读方式[J]. 图书与情报, 2016(6):47-55.
[4] 朱亚鹏. 政策创新与政策扩散研究述评[J]. 武汉大学学报(哲学社会科学版), 2010(4):565-573.
[5] Rogers, E.M. Diffusion of Innovations[M]. New York: The Free Press. 1983.
[6] Lucas A. Public Policy Diffusion Research: Integrating Analytic Paradigm [J]. Science Communication. 1983 ( 4 ) :379-408.
[7] 杨代福. 西方政策创新扩散研究的最新进展[J]. 国家行政学院学报, 2016(1):122-126.
[8] Walker J L. The Diffusion of Innovations Among the American States[J]. American Political Science Review, 1969,63(3):881-893.
[9] Gray V. Innovation in the States: A Diffusion Study[J]. American Political Science Review, 1973, 67(4):1174-1185.
[10] Savage R L. Policy innovativeness as a trait of American states[J]. The Journal of Politics,1978,40(1):212-224.
[11] Savage R L. Diffusion research traditions and the spread of policy innovations in a federal system [J]. Publius:The Journal of Federalism,1985,15(4):1-28.
[12] Mohr L B. Determinants of innovation in organizations[J]. American political science review, 1969, 63(1): 111-126.
[13] Berry F S, Berry W D. State Lottery Adoptions as Policy Innovations: An Event History Analysis[J]. The American Political Science Review, 1990, 84(2):395.
[14] Berry F S, Berry W D. Tax innovation in the states: Capitalizing on political opportunity[J]. American Journal of Political Science, 1992: 715-742.
[15] Berry F S. Sizing up state policy innovation research[J]. Policy Studies Journal, 1994,22(3):42-456.
[16] Feiock R C, West J P. Testing competing explanations for policy adoption: Municipal solid waste recycling programs[J]. Political Research Quarterly, 1993, 46(2): 399-419.
[17] Mintrom M, Vergari S. Policy networks and innovation diffusion: The case of state education reforms[J]. The Journal of Politics, 1998, 60(1): 126-148.
[18] Godwin M L, Schroedel J R. Policy diffusion and strategies for promoting policy change: Evidence from California local gun control ordinances[J]. Policy Studies Journal, 2000, 28(4): 760-776.
[19] Hays S P. Patterns of reinvention: The nature of evolution during policy diffusion[J]. Policy Studies Journal, 1996, 24(4): 551-566.
[20] Mintrom M. The state-local nexus in policy innovation diffusion: The case of school choice[J]. Publius: The Journal of Federalism, 1997, 27(3): 41-60.
[21] Mooney C Z, Lee M H. The temporal diffusion of morality policy: The case of death penalty legislation in the American states[J]. Policy Studies Journal, 1999, 27(4): 766-780.
[22] Miller E A. Advancing Comparative State Policy Research: Toward Conceptual Integration and Methodological Expansion[J]. State & Local Government Review, 2004, 36(1):35-58.
[23] Strebel F. Visibility and facticity in policy diffusion: going beyond the prevailing binarity[J]. Policy Sciences, 2012, 45(4):385-398.
[24] Graham E R, Shipan C R, CraigVolden. The Communication of Ideas across Subfields in Political Science[J]. Ps Political Science & Politics, 2014, 47(2):468-476.
[25] 杨启光. 全球化进程中的国际教育政策转移[J]. 比较教育研究, 2009(12):113-117.
[26] 包海芹. 国家学科基地政策扩散研究[M]. 北京大学出版社, 2011.
[27] 郭璇. 试析全球化语境下创意产业政策的移植和扩散机制[J]. 浙江社会科学, 2015(6):82-86.
[28] 张剑,黄萃,叶选挺,等. 中国公共政策扩散的文献量化研究——以科技成果转化政策为例[J]. 中国软科学,2016 (2):145-155.
[29] 林雪霏. 政府间组织学习与政策再生产:政策扩散的微观机制——以“城市网格化管理”政策为例[J]. 公共管理学报, 2015(1):11-23.
[30] 王洪涛,魏淑艳. 地方政府信息公开制度时空演进机理及启示——基于政策扩散视角[J].东北大学学报(社会科学版),2015,17(6):600-606.
[31] 裴雷, 张奇萍, 李向举,等. 中国信息化政策扩散中的政策主题跟踪研究[J]. 图书与情报, 2016(6):63-71.
[32] 施茜, 裴雷, 邱佳青. 政策扩散时间滞后效应及其实证评测——以江浙信息化政策实践为例[J]. 图书与情报, 2016(6):56-62.
[33] 施茜, 裴雷, 李向举, 等. 信息化政策理论与实践的交互扩散研究——以江浙信息化政策样本为例[J]. 情报学报, 2016, 35(10):1081-1089.
[34] 王周宾. 新型农村养老保险试点中的政策扩散机制研究[D].2016.
[35] 王小杰. 政策扩散视角下中国铁路技术规章管理的文献量化与博弈研究[D]. 2018.
[36] 陈芳. 政策扩散、政策转移和政策趋同——基于概念、类型与发生机制的比较[J]. 厦门大学学报(哲学社会科学版), 2013(6):8-16.
[37] 刘伟. 国际公共政策的扩散机制与路径研究[J]. 世界经济与政治, 2012(4):40-58.
[38] 王浦劬, 赖先进. 中国公共政策扩散的模式与机制分析[J]. 北京大学学报(哲学社会科学版), 2013, 50(6):14-23.
[39] 周望. 政策扩散理论与中国“政策试验”研究:启示与调适[J]. 四川行政学院学报, 2012(4):43-46.
[40] Sausgruber R, Tyran J R. Are we taxing ourselves? How deliberation and experience shape voting on taxes[J]. Journal of Public Economics, 2011, 95(1):164-176.
[41] Volden C. States as Policy Laboratories: Emulating Success in the Children's Health Insurance Program[J]. American Journal of Political Science, 2010, 50(2):294-312.
[42] Garrett K N, Jansa J M. Interest group influence in policy diffusion networks[J]. State Politics & Policy Quarterly, 2015, 15(3): 387-417.
[43] Desmarais B A, Harden J J, Boehmke F J. Persistent policy pathways: Inferring diffusion networks in the American states[J]. American Political Science Review, 2015, 109(2): 392-406.
[44] Wilkerson J D. Large-scale Computerized Text Analysis in Political Science[J]. Annual Review of Political Science, 2017, 20(1):529-544.
[45] Svyatkovskiy A, Imai K, Kroeger M, et al. Large-scale text processing pipeline with Apache Spark[C]// IEEE International Conference on Big Data. 2016.
[46] Linder F, Desmarais B A, Burgess M, et al. Text as Policy: Measuring Policy Similarity Through Bill Text Reuse[J]. Social Science Electronic Publishing, 2016.
[47] Gilardi F, Shipan C R, Wueest B. Policy diffusion: The issue-definition stage[J]. University of Zurich and University of Michigan, 2018.
[48] 武学振. 中国省级政府信息政策创新扩散研究[D]. 2016.
[49] Comparative Agendas. [EB/OL]. https://www.comparativeagendas.net/.
[50] 汪涛, 谢宁宁. 基于内容分析法的科技创新政策协同研究[J]. 技术经济, 2013, 32(9):22-28.
[51] 胡赛全, 詹正茂, 钱悦, 等. 战略性新兴产业发展的政策工具体系研究——基于政策文本的内容分析[J]. 科学管理研究, 2013, 31(3):66-69.
[52] Grimmer J, Stewart B M. Text as data: The promise and pitfalls of automatic content analysis methods for political texts[J]. Political analysis, 2013, 21(3): 267-297.
[53] 李江, 刘源浩, 黄萃,等. 用文献计量研究重塑政策文本数据分析——政策文献计量的起源、迁移与方法创新[J]. 公共管理学报, 2015(2).
[54] 陈慧茹, 肖相泽, 冯锋. 科技创新政策加权共词网络研究——基于扎根理论与政策测量[J]. 科学学研究, 2016(12):12-19.
[55] 丁洁兰, 刘细文, 杨立英, 等. 科学计量方法在科技政策研究中应用的实证研究[J]. 图书情报工作, 2017(61):86.
[56] Laver M, Benoit K, Garry J. Extracting Policy Positions from Political Texts Using Words as Data[J]. American Political Science Review, 2003, 97(02):311-331.
[57] Slapin J B, Proksch S O. A scaling model for estimating time‐series party positions from texts[J]. American Journal of Political Science, 2008, 52(3): 705-722.
[58] Hopkins D J, King G. A method of automated nonparametric content analysis for social science[J]. American Journal of Political Science, 2010, 54(1): 229-247.
[59] Leifeld P, Haunss S. Political discourse networks and the conflict over software patents in Europe[J]. European Journal of Political Research, 2012, 51(3): 382-409.
[60] Nowlin M C. Modeling issue definitions using quantitative text analysis[J]. Policy Studies Journal, 2016, 44(3): 309-331.
[61] 李辉, 曾文, 吴晨生, 等. 中文科技政策数据分析方法研究——以新能源汽车领域科技政策为例[J]. 现代情报, 2018, v.38;No.324(06):70-74.
[62] 保罗·A.萨巴蒂尔. 政策过程理论[M]. 三联书店, 2004.
[63] 孙蕊, 吴金希, 王少洪. 中国创新政策演变过程及周期性规律[J]. 科学学与科学技术管理, 2016, 37(3):13-20.
[64] 李庆. 科技创新政策的转移、转移网络和竞争力研究:以国家自主创新示范区为例[D]. 2017.
[65] Piwowar H. Altmetrics: Value all research products[J]. Nature, 2013, 493(7431): 159.
[66] 苏竣, 黄萃. 中国科技政策要目概览[M]. 北京:科学技术文献出版社,2012.
[67] Lundvall B Å, Borrás S. Science, technology and innovation policy[J]. The Oxford handbook of innovation, 2005: 599-631.
[68] 苏竣. 公共科技政策导论[M]. 科学出版社, 2014.
[69] 常耀成, 张宇翔, 王红, 等. 特征驱动的关键词提取算法综述[J]. 软件学报, 2018, v.29(07):224-248.
[70] Entman R M, Rojecki A. Freezing out the public: Elite and media framing of the US anti‐nuclear movement[J]. 1993.
[71] 樊梦佳, 段东圣, 杜翠兰, 等. 统计与规则相融合的领域术语抽取算法[J]. 计算机应用研究, 2016, 33(8).
[72] 张越, 刘琦岩, 张玄玄, 望俊成. 科技成果转化政策文本中的领域关键词汇提取研究[J].中国科技资源导刊,2018,50(03):68-75.
[73] 曾文, 李智杰, 王小玉, 等. 科技政策术语自动识别技术初探[J]. 中国科技资源导刊, 2017(3).
[74] 国家行政机关公文处理办法. [EB/OL]. http://www.gov.cn/gongbao/content/2000/content_60454.htm. 2000.
[75] 魏伟, 郭崇慧, 陈静锋. 国务院政府工作报告(1954—2017)文本挖掘及社会变迁研究[J]. 情报学报, 2018, v.37(04):70-85.
[76] Le Q V, Mikolov T. Distributed representations of sentences and documents[J]. arXiv:1405.4053, 2014.
[77] 苏金树, 张博锋, 徐昕. 基于机器学习的文本分类技术研究进展[J]. 软件学报, 2006, 17(9):1848-1859.
[78] Herbrich R. Large margin rank boundaries for ordinal regression[J]. Advances in large margin classifiers, 2000: 115-13.
[79] Joachims T. Optimizing search engines using clickthrough data[C]// Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2002: 133-142.
Publication date:

 2022-06-12    

基于蒙特卡罗算法的皮肤病诊疗路径关键技术研究.张瑾

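Link

Title:

 基于蒙特卡罗算法的皮肤病诊疗路径关键技术研究

Author:

 张瑾

Student ID:

 1601210849

Language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Public availability:

 After 3 years

Level of study:

 Master's

Degree:

 Professional Master of Engineering

Degree-granting institution:

 Peking University

Department:

 School of Software and Microelectronics

Supervisor:

 刘耀

Supervisor's affiliation:

 School of Software and Microelectronics

Second supervisor:

 高志军

Second supervisor's affiliation:

 School of Software and Microelectronics

Defense date:

 2019-05-29

Title (foreign language):

 Research on Key Technologies of Dermatosis Diagnosis and Treatment Path Based on Monte Carlo Algorithms

Keywords (Chinese):

 皮肤病诊疗 病历分析 蒙特卡罗算法 最短路径

Keywords (foreign language):

 Dermatology Diagnosis and Treatment; Medical Record Analysis; Monte Carlo Algorithms; The Shortest Path

Abstract: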

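After more than 60 years of development, information technology has reached every aspect of social life, and its application across the medical fields generates large volumes of data. Skin diseases are common and frequently occurring, with more than a thousand related conditions. Medical record data is highly valuable: the semantic knowledge points it contains can support clinical decision-making and health management. Dermatology clinics nationwide currently face long waiting times, difficulty obtaining care and medication, and inaccurate diagnoses, so dermatologists urgently need a computer-aided tool that makes recommendations automatically to support decision-making and intelligent diagnosis. This thesis proposes a method based on the Monte Carlo algorithm that quickly identifies the knowledge points serving as the basis for dermatological diagnosis and treatment and forms a shortest diagnosis-and-treatment path, which is used to recommend the best next step to the physician.

Building on a systematic analysis and summary of the structure and content of dermatology medical records, and taking 《皮肤科诊疗常规》 (Routines of Dermatology Diagnosis and Treatment) as the reference for judging diagnostic evidence, the thesis exploits the characteristics and strengths of the Monte Carlo algorithm to propose a scheme that uses medical records as the data set, treats the extraction and structuring of diagnostic evidence as the object of study, generates dermatological diagnosis-and-treatment paths, and computes the shortest one with the Monte Carlo algorithm. The feasibility of the scheme is verified experimentally. The research focuses on three questions: how to analyze and model traditional dermatological diagnosis and treatment so as to form a matrix suited to Monte Carlo computation; how to extract diagnostic evidence from the correspondence between the structure and content of medical records and of the diagnosis-and-treatment manual; and how to apply Monte Carlo simulation and parameter tuning to generate the shortest diagnosis-and-treatment path in support of clinical practice.

The specific work is as follows. The textual features of dermatology medical records and the diagnosis-and-treatment manual are analyzed, and the semantic and structural information of the text is mined in depth to extract the semantic set of diagnostic-evidence knowledge points. Using text analysis models and machine learning, a matrix suited to Monte Carlo computation is formed and a dermatological diagnosis-and-treatment model is built. Based on the Monte Carlo algorithm, the thesis explores and implements the key techniques of representing and structuring the diagnosis-and-treatment process, computing paths, and shortening them, and computes the shortest diagnosis-and-treatment path. Finally, experiments demonstrate the effectiveness of these methods, which can be applied to recommending the best next piece of diagnostic evidence.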
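The idea of computing a shortest path over a matrix by Monte Carlo simulation can be sketched as repeated random walks that keep the cheapest walk reaching the goal. The abstract does not say how the matrix is defined, so this sketch assumes that `cost[i][j]` is the cost of moving from knowledge point `i` to knowledge point `j` (or `None` when no transition exists); the function name and parameters are hypothetical, and the thesis's actual algorithm may differ, for example by using tree search.

```python
import random


def monte_carlo_shortest_path(cost, start, goal, n_rollouts=2000, seed=0):
    """Estimate a shortest start->goal path by random rollouts.

    cost[i][j] is the step cost from node i to node j, or None if there
    is no edge. Returns (best_cost, best_path), or (inf, None) if no
    rollout reached the goal.
    """
    rng = random.Random(seed)
    best_cost, best_path = float("inf"), None
    for _ in range(n_rollouts):
        node, path, total = start, [start], 0.0
        visited = {start}
        while node != goal:
            choices = [j for j, c in enumerate(cost[node])
                       if c is not None and j not in visited]
            if not choices:
                break  # dead end: abandon this rollout
            nxt = rng.choice(choices)
            total += cost[node][nxt]
            if total >= best_cost:
                break  # already no better than the best path found
            visited.add(nxt)
            path.append(nxt)
            node = nxt
        else:
            # The loop ended because the goal was reached, not via break.
            best_cost, best_path = total, path
    return best_cost, best_path
```

Uniform random rollouts are the simplest choice and only give an estimate; on a large matrix one would bias the sampling toward promising transitions, and on a small one an exact method such as Dijkstra's algorithm is a useful check on the result.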

Abstract (foreign language):

After more than 60 years of development, information technology has spread throughout society, and its application in medicine generates large amounts of data. Dermatoses are common and frequently occurring, with more than 1,000 known kinds. Medical record data is of great value, and its semantic knowledge points can be used for clinical decision support and health management. Dermatology clinics nationwide currently face problems such as long waiting times, difficulty obtaining care and medication, and inaccurate diagnoses. Dermatologists therefore urgently need a computer-aided tool that makes recommendations automatically to assist decision-making and intelligent medical diagnosis. This paper proposes a method based on Monte Carlo algorithms that quickly extracts diagnostic knowledge points and identifies the shortest path of dermatological diagnosis and treatment, which can be used to recommend the next step to doctors.

After a systematic analysis and summary of the structure and content of dermatology medical records, this paper proposes a scheme for computing the shortest path of dermatological diagnosis and treatment with Monte Carlo algorithms. The scheme takes the Routines of Dermatology Diagnosis and Treatment as the basis for diagnosis and treatment, draws on the advantages of Monte Carlo algorithms, and uses dermatology medical records as the data set from which diagnostic knowledge points are extracted and structured. The feasibility of the scheme is verified experimentally. The paper focuses on how to model traditional diagnosis and treatment methods as a large matrix suited to Monte Carlo algorithms, how to extract diagnostic knowledge points from the correspondence between the structure and content of medical records and the Routines of Dermatology Diagnosis and Treatment, and how to tune parameters and compute the shortest diagnosis-and-treatment path with Monte Carlo algorithms.

The work proceeds as follows. First, the textual characteristics of dermatology medical records and of the Routines of Dermatology Diagnosis and Treatment are analyzed and, based on a model of the text structure, diagnostic knowledge points are extracted automatically by rules. Then a large matrix suited to Monte Carlo algorithms is formed using text analysis models and machine learning, and a dermatological diagnosis-and-treatment model is constructed. Once the diagnosis and treatment processes have been represented and structured, Monte Carlo algorithms are used to evolve the paths and compute the shortest one. Finally, experiments demonstrate the effectiveness of these methods, and a recommendation system for intelligent dermatological diagnosis and treatment is designed and implemented.

Classification number:

 TP3

Total pages:

 70

Number of references:

 105

References:
肖郑颖 . 老化皮肤光分布的蒙特卡洛模拟[j]. 生物医学工程研究, 2017.
吴淑莲. 老化皮肤光学特征提取及其治疗过程监测[d]. 福建师范大学, 2011.
朱学骏. 我国皮肤病的基础与临床发展现状[j]. 中国医学科学院学报, 2009, 31(1).
陈梅, 吕晓娟, 张麟, et al. 人工智能助力医疗的机遇与挑战[j]. 中国数字医学, 2018.
lawrence d r, palaciosgonzález c, harris j. artificial intelligence. [j]. cambridge quarterly of healthcare ethics, 2016, 25(2):250-261.
csdn.(2018).alphago背后的力量:蒙特卡洛树搜索入门指南.https://blog.csdn.net/np4rh i455vg29y2/article/deta
胡波. 基于知识库的aiscp导医系统的设计与实现[d]. 苏州大学.
黄欢, 赵钢. 人工智能在医疗及神经病学领域的应用[j]. 华西医学, 2018, v.33(06):10-14.
陈仰东. 新体制下的“三医联动”及实现路径[j]. 中国医疗保险, 2018, 122(11):25-28.
谢俊祥, 张琳. 人工智能在皮肤病诊断中的应用[j]. 中国医疗器械信息, 2018, 24(17):31-33+145.
优麦医生. (2019).构建中国皮肤科专属生态圈. http://web.umer.com.cn/news
腾讯资讯. (2018). 盘点全球11个皮肤病ai项目:63%用于医生端,中国企业最多,皆与顶级医院合作. https://xw.qq.com/cmsid/20180508a09e4o00
thissen m , udrea a , hacking m , et al. mhealth app for risk assessment of pigmented and nonpigmented skin lesions - a study on sensitivity and specificity in detecting malignancy[j]. telemedicine and e-health, 2017, 23(12).
wolffenbuttel r f , wolffenbuttel hosli t m . medical apps in need of optical microspectrometers[j]. microsystem technologies, 2016, 22(7):1549-1555.
rickert de, h. (2018). how machine learning technology detects skin cancer. https://www.skinvision.com/articles/how-machine-learning-detects-skin-cancer
esteva a , kuprel b , novoa r a , et al. dermatologist-level classification of skin cancer with deep neural networks[j]. nature, 2017, 542(7639):115-118.
邱龙杰. (2018). ai应用于医疗预测需整合机器学习行为演算法. https://www.digitimes.com.tw/iot/article.asp?cat=158&id=0000536817_oll7vp4f2fcvj46ayqgvr
张丽, 商洪涛, 王彪, et al. 医院微信服务平台的设计与实现[j]. 中国医学装备, 2015(10):46-48.
颜红梅. 医学知识工程生产线与基于人工神经网络和遗传算法的医学决策支持系统的研究[d]. 重庆大学, 2003.
张慧玲, 宁立, 孟金涛, et al. 大规模图处理研究[j]. 网络新媒体技术, 2014, 3(1):26-30.
曹银. 试论手机媒体与图书出版[j]. 传播与版权, 2013(6):98-99.
sharma m, clark h, armour t, et al. acute stroke: evaluation and treatment[j]. evid rep technol assess, 2005(127):1-7.
hoeppner m a. ncbi bookshelf: books and documents in life sciences and health care[j]. nucleic acids research, 2013, 41(database issue):d1251-d1260.
shekelle p g , morton s c , keeler e b . costs and benefits of health information technology.[j]. health affairs, 2006, 28(2):w282-93.
gartlehner g , hansen r a , nissman d , et al. criteria for distinguishing effectiveness from efficacy trials in systematic reviews[m]// cell interactions in differentiation :. academic press, 2006.
cates s. ncbi: national center for biotechnology information[j]. connexions, 2006.
stoesser g, griffith m, griffith o l. ncbi (national center for biotechnology information)[m]// dictionary of bioinformatics and computational biology. 2014.
米洋. 基于xml的电子病历系统的设计与实现[d]. 河北科技大学, 2010.
俞文敏, 马培英, 黄美红. “军卫一号”护士工作站软件缺陷和解决对策[j]. 解放军医院管理杂志, 2006, 13(6):515-516.
李昊旻. 电子病历的标准化结构化方法研究及实践[d]. 浙江大学生物医学工程与仪器科学学院, 2007.
王海波. 新型医疗服务模式下电子病历管理的研究[d]. 山东师范大学, 2010.
薛万国. 我国电子病历研究进展[j]. 中国医院管理, 2005, 25(2):17-19.
waegemann c p. the five levels of electronic health records[j]. m.d.computing computers in medical practice, 1996, 13(3):199.
moen a, henry s b, warren j j. representing nursing judgements in the electronic health record[j]. journal of advanced nursing, 2010, 30(4):990-997.
wang x, chase h, markatou m, et al. selecting information in electronic health records for knowledge acquisition[j]. journal of biomedical informatics, 2010, 43(4):595-601.
mohammedrajput n a, smith d c, mamlin b, et al. openmrs, a global medical records system collaborative: factors influencing successful implementation.[j]. amia. annual symposium proceedings / amia symposium. amia symposium, 2011, 2011(694):960.
palmer r , simmscendan j , kim m . implementing an electronic health record as an ive measure of care provider accountability for a resource-poor rural area in the dominican republic[c]// international conference on appropriate healthcare technologies for developing countries. iet, 2013.
clay-williams r, nosrati h, cunningham f c, et al. do large-scale hospital- and system-wide interventions improve patient outcomes: a systematic review[j]. bmc health services research, 2014, 14(1):369.
余本功, 李娜, 江澍, et al. 基于第三方的电子病历信息整合平台研究 (research on information integration platform for electronic health records based on the third party)[j]. 计算机系统应用, 2008, 17(5):2-5.
sun k h , hune c , keun l i . development of an electronic claim system based on an integrated electronic health record platform to guarantee interoperability[j]. healthcare informatics research, 2011, 17(2):101-.
tychalas d . planning and development of an electronic health record client based on the android platform[c]// informatics. ieee, 2010.
keshavjee k, mirza k, martin k. the next generation emr[j]. stud health technol inform, 2015, 208:210-214.
sauer r , elke m . role of the electronic patient record in the development of general practice in the netherlands[j]. methods of information in medicine, 1999, 38(04/05):350-354.
rossi m a, consorti f, galeazzi e. standards to support development of terminological systems for healthcare telematics.[j]. methods inf med, 1998, 37(04/05):551-563.
smith b, ceusters w. hl7 rim: an incoherent standard.[j]. studies in health technology & informatics, 2006, 124(124):133.
klein a, ganslandt t, brinkmann l, et al. experiences with an interoperable data acquisition platform for multi-centric research networks based on hl7 cda.[j]. methods of information in medicine, 2007, 46(05):580-585.
blanquer i, hernandez v, salavert j, et al. using grid-enabled distributed metadata database to index dicom-sr.[j]. studies in health technology & informatics, 2009, 147:117.
李晓雅. 卫生部出台电子病历基本规范[j]. 中国社区医师(医学专业), 2010(11):149-149.
卫生部. 病历书写基本规范(试行)[j]. 中国卫生法制, 2002, 1(5):183-186.
基于蒙特卡罗树搜索的计算机扑克程序[d]. 北京邮电大学, 2014.
季辉, 丁泽军. 双人博弈问题中的蒙特卡洛树搜索算法的改进[j]. 计算机科学, 2018, 45(1):140-143.
董兆安. 二叉树枚举算法的研究[d]. 华东师范大学, 2005.
熊俊, 肖先勇, 邓武军,等. 基于广度优先搜索算法和区域节点行向量法的复杂配电网络可靠性评估[j]. 电网技术, 2007, 31(9):27-32.
陶华, 杨震, 张民,等. 基于深度优先搜索算法的电力系统生成树的实现方法[j]. 电网技术, 2010, 34(2):120-124.
段莉琼, 朱建军, 王庆社,等. 改进的最短路径搜索a*算法的高效实现[j]. 海洋测绘, 2004, 24(5):20-22.
crevier b, cordeau j f, laporte g. the multi-depot vehicle routing problem with inter-depot routes[j]. european journal of operational research, 2007, 176(2):756-773.
张杨, 亚森·艾则孜, 郭文强,等. 深度搜索在舆情控制系统中的应用研究[j]. 信息网络安全, 2013(4):91-92.
冯冲. 类人答题系统中的不等式问题自动求解的研究与实现[d].
彭丽, 李茂军. “机器人游中国”路径优化方法[j]. 工业控制计算机, 2012, 25(12):1-3.
佚名. 基于多智能体的运营高速铁路救援仿真研究[j]. 铁路计算机应用, 2018, 27(7):66-71.
张加佳. 基于uct算法的非完备信息多人军棋博弈系统[d]. 哈尔滨工业大学, 2008.
李营花, 张维, 黄祖广. 基于蒙特卡罗法的数控机床可靠性仿真[j]. 制造技术与机床, 2017(1):33-37.
陈经. 计算机处理围棋复杂的能力压倒了人类[j]. 物理, 2017, 46(9):616-623.
zhao j , qiu x , zhang s , et al. part-of-speech tagging for chinese-english mixed texts with dynamic features[c]// proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning. association for computational linguistics, 2012.
qiu x, gong j, huang x. overview of the nlpcc 2017 shared task: chinese news headline categorization[c]// national ccf conference on natural language processing & chinese computing. 2017.
王海明. 基于tf-idf改进计算模型的实时大数据处理系统设计与实现[d].
zhou l, zhang d. nlpir: a theoretical framework for applying natural language processing to information retrieval[j]. journal of the american society for information science & technology, 2010, 54(2):115-123.
孙琳. 基于nlpir汉语分词系统和bfsu powerconc 1.0的警务汉语词频与搭配研究——以禁毒案件为例[j]. 现代语文:语言研究版, 2016(12):140-145.
li x , zhang c . research on enhancing the effectiveness of the chinese text automatic categorization based on ictclas segmentation method[c]// ieee international conference on software engineering & service science. 0.
csdn. (2018). 中文分词原理及分词工具绍. https://blog.csdn.net/qq_26598445/article/details/81298456
min s, chambers t. text mining with the stanford corenlp[m]// measuring scholarly impact. 2014.
emani c k, cullot n, nicolle c. understandable big data: a survey[j]. computer science review, 2015, 17:70-81.
solaimani m, gopalan r, khan l, et al. spark-based political event coding[c]// ieee second international conference on big data computing service & applications. 2016.
burgard w, brock o, stachniss c. crf-matching: conditional random fields for feature-based scan matching[m]// robotics:science and systems iii. 2007.
taub l . applying conditional random fields to payload anomaly detection with crfpad[c]// southeastcon, ieee. ieee, 2013.
hong, tzungpei, lin, et al. using tf-idf to hide sensitive itemsets[j]. applied intelligence, 2013, 38(4):502-510.
hakim a a , erwin a , eng k i , et al. automated document classification for news article in bahasa indonesia based on term frequency inverse document frequency (tf-idf) approach[c]// international conference on information technology & electrical engineering. ieee, 2015.
zhenjun l i, zhou z. improvement of term frequency-inverse document frequency algorithm based on document triage[j]. journal of computer applications, 2015.
佚名. 基于synonyms、k-means的短文本聚类算法[j]. 电脑知识与技术, 2019.
施聪莺, 徐朝军, 杨晓江. tfidf算法研究综述[j]. 计算机应用, 2009, 29(b06):167-170.
路永和, 李焰锋. 改进tf-idf算法的文本特征项权值计算方法[j]. 图书情报工作, 2013, 57(3):90-95.
张瑾. 基于改进tf-idf算法的情报关键词提取方法[j]. 情报杂志, 2014(04):153-155.
anandkumar a, foster d p, hsu d, et al. a spectral algorithm for latent dirichlet allocation[j]. algorithmica, 2015, 72(1):193-214.
ganesan a, brantley k, pan s, et al. ldaexplore: visualizing topic models generated using latent dirichlet allocation[j]. 2015.
chen j, li k, zhu j, et al. warplda: a cache efficient o(1) algorithm for latent dirichlet allocation[j]. proceedings of the vldb endowment, 2016, 9(10):744-755.
martinez o, tsechpenakis g. integration of active learning in a collaborative crf[c]// ieee computer society conference on computer vision & pattern recognition workshops. 2008.
yang r n b. an online learned crf model for multi-target tracking[c]// computer vision & pattern recognition. 2012.
shen f, rui g, yan s, et al. semantic segmentation via structured patch prediction, context crf and guidance crf[c]// ieee conference on computer vision & pattern recognition. 2017.
王岩, 尹海丽, 窦在祥. 蒙特卡罗方法应用研究[j]. 青岛理工大学学报, 2006, 27(2):111-113.
向文武. 基于决策树与蒙特卡罗模拟集成模型的石油勘探投资决策分析[j]. 当代石油石化, 2017, 25(1):44-49.
刘子正, 卢超, 张瑞友. 基于蒙特卡罗树搜索的“2048”游戏优化算法[j]. 控制工程, 2016, 23(4):550-555.
max.book. (2016). 蒙特卡罗博弈方法. https://max.book118.com/html/2016/0815/51475733.shtm
何长春, 廖继海, 杨小宝. 平面团簇稳定结构的蒙特卡罗树搜索[j]. 物理学报, 2017, 66(16):82-88.
csdn. (2016). 一种ucb1算法的简单实现及效果对比. https://blog.csdn.net/wangweiran1/article/details/50533275
基于静态评估的计算机围棋uct算法改进研究[d]. 南昌航空大学, 2015.
佚名. 计算机围棋博弈中uct算法的应用及改进[d]. 北京邮电大学, 2011.
柴伟凡, 梁志伟, 夏晨曦. 基于蒙特卡洛树搜索的仿真足球防守策略研究[j]. 微型机与应用, 2017, 36(23):54-57+61.
nasser m, kurosh r. cheminform abstract: the first synthesis of allyl isonitriles from baylis-hillman adducts, and their application in the synthesis of substituted imidazo[1,2-a]pyridines and tetraazadibenzoazulenes[j]. synthesis, 2009, 2009(03):431-437.
福蒙蒙, 陈躲, 谢晶. 第四范式:让人工智能触手可及[j]. 全球商业经典, 2017.
蒋卫民. “深层学习”的思考[j]. 广西林业, 2016(3):1-1.
孙英龙. 非完美信息博弈算法研究与军棋博弈系统设计与实现[d]. 2013.
皮肤性病学(第5版)[m]. 2004.
罗汉超. 喜读《英汉皮肤科学词典》[j]. 中国皮肤性病学杂志, 1990(4).
李军莲. mesh词表的新变化及有关标引规则[j]. 医学信息学杂志, 2007, 28(3):285-28.
马怡梅. his系统在医院管理中的应用研究[d].
Publication date:

 2022-06-21    

面向领域的先进技术侦测关键技术研究.张茜

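Link

Title:

 面向领域的先进技术侦测关键技术研究

Author:

 张茜

Student ID:

 1601210859

Language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Public availability:

 After 3 years

Level of study:

 Master's

Degree:

 Professional Master of Engineering

Degree-granting institution:

 Peking University

Department:

 School of Software and Microelectronics

Supervisor:

 刘耀

Supervisor's affiliation:

 Institute of Scientific and Technical Information of China

Second supervisor's affiliation:

 Peking University

Defense date:

 2019-05-29

Title (foreign language):

 Research on Domain-Oriented Advanced Technology Detection

Keywords (Chinese):

 先进技术 技术侦测 文本挖掘

Keywords (foreign language):

 Advanced Technology; Technology Detection; Text Mining

Abstract: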

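This thesis addresses the lack of a comprehensive model for advanced technology detection in existing technology detection research, and studies the key techniques of advanced technology detection using the scientific literature of a domain. A survey of prior work shows that mining technology points, and identifying how their advancedness manifests in text, are central to building a sound and effective detection model. Following the ideas and methods of advanced technology detection, the thesis therefore first constructs a detection model that covers the mining of potential technology points in a domain, the mining of potential advanced technologies, and the discovery of their features; it then builds a fusion model over the domain's advanced-technology features to complete the model design. In parallel, it establishes an indicator system for evaluating technological advancedness. Finally, the initial model is tested in several technology domains and the results are compared against the evaluation indicators to optimize the detection model. The specific research content is as follows:

First, the thesis examines how different types of scientific literature contribute to acquiring technology points and formulates a technical-term acquisition strategy for each type; the characteristics of multi-source literature also inform the selection of textual features of advanced technologies. A domain knowledge base is built from scientific literature resources, and its conceptual structure helps later steps mine textual features of advancedness. This part proposes TFIDFC-value, a technical-string extraction method tailored to the characteristics of technology points, and experiments show that it is reasonably effective. The domain technology points obtained in this way serve as the basic technical terms for mining advanced technologies and as part of the objects of advancedness evaluation.

Second, the thesis draws on four aspects to extract textual features of technological advancedness: the technology life cycle, the evolution of domain technology topics, terminology in domain science and technology texts, and domain patent layouts. It hypothesizes that advanced technologies lie in the emerging and growth stages of the technology life cycle and appear in new topics of the domain's technology evolution, in new terms in domain project texts, and in the non-mainstream patent layouts of large companies in the domain. From these hypotheses and the basic technical terms, it extracts candidate technical terms that may be advanced, widening the range of terms acquired. Drawing on related research, the thesis then derives an indicator system for evaluating technological advancedness and fuses the previously extracted textual feature information on the basis of theories such as technology maturity and technological knowledge diffusion. The indicator system is used for advanced technology detection in a domain and, together with advanced-technology feature mining, constitutes the initial detection model.

Finally, the thesis selects the self-driving car and Internet of Things domains for backtracking experiments with the detection model. The experiments show that the initial model is effective. Guided by the backtracking results, the model is improved by raising the domain specificity and unit nature of the technology points, and the improved results are compared with the original ones. The comparison shows that the improved model raises, to some extent, the accuracy of advancedness detection for the top-ranked candidate technology points.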
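The TFIDFC-value method named above is not specified in the abstract. As a rough illustration of what combining the two measures could look like, the sketch below computes the commonly cited C-value for multi-word candidates (frequency, discounted by occurrences nested inside longer candidates and scaled by the log of term length) and multiplies it by a smoothed IDF factor. The multiplication, the smoothing, and all names here are assumptions, not the thesis's formula.

```python
import math


def _contains(longer, shorter):
    """True if `shorter` occurs as a contiguous run of words in `longer`."""
    k = len(shorter)
    return any(longer[i:i + k] == shorter for i in range(len(longer) - k + 1))


def c_value(freq):
    """C-value per candidate; `freq` maps a tuple of words to its frequency."""
    scores = {}
    for a, f_a in freq.items():
        containers = [b for b in freq if len(b) > len(a) and _contains(b, a)]
        if containers:
            # Discount occurrences that are explained by longer candidates.
            f_a = f_a - sum(freq[b] for b in containers) / len(containers)
        # log2 of the length in words, so single-word candidates score 0.
        scores[a] = math.log2(len(a)) * f_a
    return scores


def tfidf_c_value(freq, doc_freq, n_docs):
    """Hypothetical fusion: C-value times a smoothed IDF factor."""
    cv = c_value(freq)
    return {
        a: cv[a] * (math.log((1 + n_docs) / (1 + doc_freq[a])) + 1.0)
        for a in freq
    }
```

The C-value part favors long, frequently repeated strings that are not merely fragments of longer terms, while the IDF factor favors strings concentrated in few documents; how the thesis actually weighs the two is not stated in the abstract.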

Abstract (foreign language):

This paper addresses the lack of a comprehensive model for advanced technology detection in existing research, and studies the key technologies of advanced technology detection using the scientific literature of a domain. A survey of prior work shows that mining technology points, and capturing how their advanced characteristics are expressed in text, are important tasks in building a reasonable and effective detection model. This paper therefore first constructs a detection model that covers the mining of potential technology points in a domain, the mining of potential advanced technologies, and feature discovery, based on the theories and methods of advanced technology detection, and then establishes a fusion model for the domain's advanced-technology features to improve the initial model design. At the same time, an index system for evaluating technological advancement is established. Finally, the initial model is tested in several technical fields, and the experimental results are compared with the evaluation index to optimize the detection model. The specific research contents are as follows:

Firstly, this paper discusses the characteristics of different types of scientific literature resources in the acquisition of technology points and formulates corresponding strategies for acquiring technical terms; the characteristics of multi-source literature also provide a basis for selecting the text features of advanced technologies. A domain knowledge base is established from scientific literature resources, and its conceptual structure helps follow-up work mine advanced text features. This part proposes TFIDFC-value, a technical-string extraction method based on the characteristics of technology points, and experiments show that the method is reasonably effective. The domain technology points obtained by this method serve as the basic technical terms for mining advanced technologies in the domain and as part of the objects of advancement evaluation.

Secondly, this paper selects four aspects, namely technology life cycle, domain technology theme evolution, domain technology text terminology, and domain patent layout, to extract the text characteristics of technological advancement. It assumes that advanced technologies lie in the germination and growth stages of the technology life cycle and appear in new topics of domain technology evolution, in new terms in domain project texts, and in the non-mainstream patent layouts of large companies in the domain. Based on these hypotheses and the basic technical terms, candidate technical terms that may be advanced are extracted, which enlarges the scope of term acquisition. Then, drawing on related research, this paper summarizes an evaluation index system for technological advancement and fuses the previously extracted feature information on the basis of theories such as technology maturity and the diffusion of technological knowledge. The index system is used for advanced technology detection in a domain and, together with the mining of advanced-technology features, constitutes the initial detection model.

Finally, this paper chooses self-driving automobiles and the Internet of Things as the domains for backtracking experiments with the detection model. Experiments show that the initial model is effective. Based on the backtracking results, the detection model is improved by enhancing the expertise and unit nature of the technology points, and the improved backtracking results are compared with the original results. To some extent, the improved model raises the accuracy of advancement detection for the top-ranked candidate technology points.
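A claim about accuracy on the top-ranked candidates is the kind of result usually reported as precision at k. The abstract does not name the metric the thesis uses, so the following is only a minimal sketch under that assumption; `ranked` is a hypothetical ranked list of candidate technology points and `relevant` a hypothetical set of points judged advanced in hindsight.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked candidates that are in `relevant`."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked[:k]
    return sum(1 for item in top if item in relevant) / k


def compare_models(baseline, improved, relevant, ks=(1, 2, 4)):
    """Return {k: (baseline P@k, improved P@k)} for side-by-side comparison."""
    return {
        k: (precision_at_k(baseline, relevant, k),
            precision_at_k(improved, relevant, k))
        for k in ks
    }
```

Comparing the two rankings at several small values of k shows whether an improvement is concentrated at the head of the list, which is where a detection model's output is actually read.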

Classification number:

 TP3

Total pages:

 66

Number of references:

 60

References:
[1] 刘剑兰, 朱东华. 信息抽取技术在情报监测中的应用[J]. 情报学报, 2004, 23(6):661-666.
[2] 丁俊丽,赵国杰,李光泉.对技术本质认识的历史考察与新界定[J].天津大学学报(社会科学版), 2002,(1):88—92.
[3] 闫宏秀.技术进步与价值选择[D].上海:复旦大学,2003.10,31,21.
[4] 管晓刚.关于技术本质的哲学释读[J].自然辩证法研究,2001,(12):18—22.
[5] Erstling J. International technology transfer and intellectual property rights: Some essentials and options for technology transfer partners[J]. The International Executive, 1992, 34(3): 215-236.
[6] 李素建, 王厚峰, 俞士汶等, 关键词自动标引的最大熵模型应用研究, 计算机学报, 2004,9:1192-1197.
[7] Watts R J, Porter A L. Innovation forecasting[R]. ARMY TANK-AUTOMOTIVE COMMAND WARREN MI, 1997.45-50.
[8] Arthur D Little.The Strategic Management of Technology[M].Cambridge,Mass, 1981:146-163
[9] 福斯特. 创新:进攻者的优势[M]. 中信出版社, 2008:72-78.
[10] 杰姬·芬恩, 马克·拉斯金诺. 精准创新(第二版)[M]. 中国财富出版社, 2015:86-90.
[11] 黄鲁成, 历妍. 基于专利的技术发展趋势评价系统[J].系统管理学报,2010(8):384-388.
[12] 于晓勇,赵晓晨,等.基于专利信息分析的我国电动汽车的技术发展趋势研究[J].科学学与科学技术管理,2011(4):45-51.
[13] 王燕玲. 面向企业技术创新的专利分析框架研究[J]. 科技管理研究, 2013, 33(5):131-136
[14] Kim J, Hwang M, Jeong D H, et al. Technology trends analysis and forecasting application based on decision tree and statistical feature analysis[J]. Expert Systems with Applications, 2012, 39(16): 12618-12625.
[15] KIM J, LEE S, LEE J, et al. Design of TOD Model for Information Analysis and Future Prediction [J]. Communications in Computer and Information Science, 2011, 264: 301-305.
[16] Yoon B, Park Y. A text-mining-based patent network: Analytical tool for high-technology trend[J]. Journal of High Technology Management Research, 2004, 15(1):37-50.
[17] Shin J, Park Y. Analysis on the dynamic relationship among product attributes: VAR model approach [J]. Journal of High Technology Management Research, 2005, 16(2):225-239.
[18] Ozcan S, Islam N. An empirical study of nanowire technological trends[J]. The Journal of High Technology Management Research, 2017, 28(2): 246-260.
[19] 赵龙. 基于专利耦合和文本挖掘的技术演化分析——以二氧化碳捕集与存储领域为例[D]. 中国科学技术信息研究所, 2015:8-9.
[20] Liu Y, Wang R. RESEARCH ON SEMANTIC METADATA ONLINE AUXILIARY CONSTRUCTION PLATFORM AND KEY TECHNOLOGIES[J]. ICIC Express Letters Part B, 2013,4(4), 897-904
[21] Liu Y, et al. Research on semantic and syntactic analysis of patent literature. ICIC Express Letters, 2016, 10(2):471-477.
[22] 刘辉,刘耀. 基于条件随机场的专利术语抽取[J]. 数字图书馆论坛, 2014(12):46-49.
[23] 饶慧. 信息抽取技术在情报监测中的应用[J].科技尚品, 2016,(7):153,156.
[24] 胡立诺, 胡立岩. 技术检测中的信息抽取技术的应用分析[J]. 价值工程, 2014(21):236-237.
[25] 单斌, 李芳. 基于LDA话题演化研究方法综述[J]. 中文信息学报, 2010, 24(6):43-50.
[26] Griffiths T L, Jordan M I, Tenenbaum J B, et al. Hierarchical topic models and the nested chinese restaurant process[C]//Advances in neural information processing systems. 2004: 17-24.
[27] ROSEN-ZVI M, GRIFFITHS T, STEYVERS M, et al. The author-topic model for authors and documents[C]//Proceedings of the 20th conference on uncertainty in artificial intelligence. Arlington: AUAI Press, 2004: 487-494.
[28] Steyvers M, Smyth P, Rosen-Zvi M, et al. Probabilistic author-topic models for information discovery[C]//Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2004: 306-315.
[29] Rosen-Zvi M, Chemudugunta C, Griffiths T, et al. Learning author-topic models from text corpora[J]. ACM Transactions on Information Systems (TOIS), 2010, 28(1): 4.
[30] Blei D M, Lafferty J D. Dynamic topic models[C]//Proceedings of the 23rd international conference on Machine learning. ACM, 2006: 113-120.
[31] Liu Y, et al. Research on feature acquisition and key expression technology of knowledge-intensive text. ICIC Express Letters, Part B: Applications, 2014, 5(1): 57-64.
[32] 张丽玮, 郑彦宁. 高新技术项目技术风险评估体系构建研究[J]. 科学管理研究, 2014(2):36-39.
[33] 刘铭, 姚岳. 企业技术创新绩效评价指标体系研究[J]. 甘肃社会科学, 2014(4):233-236.
[34] 程文渊等. 美军重大国防采办项目技术成熟评价的价值分析研究[J]. 科研管理, 2017(s1):71-77.
[35] Boer F P. The valuation of technology : business and financial issues in R&D[J]. Research-Technology Management, 1999, 42.
[36] Diamantopoulos F, Economides A A. Performance evaluation of power control routing for ad-hoc networks[C]// Wireless Conference 2006 - Enabling Technologies for Wireless Multimedia Communications. VDE, 2006:1-6.
[37] 马向阳, 辛荣. 政府视角下以区域联想为核心的区域品牌伞构建研究[J]. 科技进步与对策, 2013(15):46-51.
[38] 李雪凤, 仝允桓. 技术价值评估方法的研究思路[J]. 科技进步与对策, 2005, 22(10):75-77.
[39] 修国义, 韩佳璇, 陈晓华. 区域创新驱动能力影响因素实证研究[J]. 金融与经济, 2017(5):49-54.
[40] 高艳红, 杨建华, 杨帆. 技术先进性评估指标体系构建及评估方法研究[J]. 科技进步与对策, 2013, (5):138-142.
[41] 郭俊芳. 基于语义挖掘的技术创新路径分析与评价方法研究[D].北京理工大学,2016:67-96.
[42] 曾文, 徐硕, 张运良, 等. 科技文献术语的自动抽取技术研究与分析[J]. 数据分析与知识发现, 2014, 30(1).
[43] 邢红兵. 信息领域汉英术语的特征及其在语料中的分布规律[J]. 产品安全与召回, 2000(3):17-21.
[44] 张榕. 术语定义抽取、聚类与术语识别研究[D]. 北京语言大学, 2006:23-24
[45] 何燕, 穗志方, 段慧明,等. 一种结合术语部件库的术语提取方法[J]. 计算机工程与应用, 2006, 42(33):4-7.
[46] 李嵩. 语言学文献标题的术语提取研究[D]. 山东大学, 2007:13-15.
[47] 韩红旗, 朱东华, 汪雪锋. 专利技术术语的抽取方法[J]. 情报学报, 2011, 30(12):1280-1285.
[48] 常鹏, 马辉. 高效的短文本主题词抽取方法[J]. 计算机工程与应用, 2011, 47(20):126-128.
[49] CHARLES. CLOCKSPEED - WINNING INDUSTRY CONTROL IN AGE OF TEMPORARY ADVANTAGE[J]. Supply Chain Management, 1998, 40(3):104-104.
[50] Liu Y, et al. Study on semantic annotation for professional literature[J].ICIC Express Letters, Part B: Applications,2015, 5(5): 1383-1389.
[51] Brin S, Page L. The anatomy of a large-scale hypertextual Web search engine[J]. Computer Networks and ISDN Systems, 1998, 30(1-7):107-117.
[52] R. Foster. Boosting the Payoff from R&D [J]. Research Management, 1982, 25: 22-27.
[53] Altuntas S , Dereli T , Kusiak A . Forecasting technology success based on patent data[J]. Technological Forecasting and Social Change, 2015, 96:202-214.
[54] 王超, 武华维, 赵燕清, et al. 基于创新全过程的知识内容扩散强度分析模型研究[J]. 情报理论与实践, 2018, 41(10):69+141-146.
[55] Jarvenpaa, Makinen. An empirical study of the existence of the Hype Cycle: A case of DVD technology[C]// IEEE International Engineering Management Conference. IEEE, 2008.
[56] autonomous-vehicles[EB/OL]. https://www.gartner.com/it-glossary/autonomous-vehicles/
[57] 清华大学-中国工程院知识智能联合研究中心. 2018年人工智能之自动驾驶研究报告 [EB/OL]. (2018-7)[2019-04-01]. https://static.aminer.cn/misc/article/selfdriving-new.pdf
[58] Benoit Lheureux , W. Roy Schulte , Alfonso Velosa. Hype Cycle for the Internet of Things, 2017 [EB/OL]. (2017-7) [2019-04-01]. https://www.gartner.com/en/documents/3770369
[59] Bouma G. Normalized (pointwise) mutual information in collocation extraction[J]. Proceedings of GSCL, 2009: 31-40.
[60] 杜丽萍, 李晓戈, 于根, 等. 基于互信息改进算法的新词发现对中文分词系统改进[J]. 北京大学学报(自然科学版), 2016, 52(1):35-40
公开日期:

 2022-06-09    

基于层次条件变分自编码器的政府公文自动生成系统的设计与实现.邓雅妮

链接

题名:

 基于层次条件变分自编码器的政府公文自动生成系统的设计与实现    

姓名:

 邓雅妮    

学号:

 1601210498    

论文语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 姚亚芝    

导师1单位:

 软件与微电子学院    

导师2姓名:

 俞敬松    

导师2单位:

 软件与微电子学院    

论文答辩日期:

 2019-05-29    

外文题名:

 THE AUTOMATIC GOVERNMENT OFFICIAL DOCUMENT GENERATION BASED ON HIERARCHICAL CONDITIONAL VARIATIONAL AUTOENCODER    

关键词:

 长短时记忆网络 条件变分自编码器 关键词抽取 政府公文写作 文本生成    

外文关键词:

 LSTM CVAE Keyword extraction Government document Automatic text generation    

论文摘要:

近年来,文本生成是自然语言处理领域(Natural Language Processing)一项极具挑战的任务,在解决短文本生成和诗歌生成等方面都取得了不错的进展,但由于当文本变长会造成信息丢失、误差传递和错误偏移等问题,因此在长文本生成上的研究还处于初步阶段,特别是中文长文本生成,而政府公文的生成又是中文长文本生成中特殊的一种。政府公文是我国传达政治任务、表达政治观点以及记录历史事件的特殊文化遗产,有着独特的行文思路和措辞特点,其生成任务所面临的难点与长文本生成有诸多共通之处,都希望生成具有用词多样性(wording diversity)和主题一致性(thematic consistency)的文本。用词多样指句中使用了多样化的词语来表词达意,而不是重复地使用单调的字或词;主题一致指文本句与句之间和句子内部词与词之间阐述的为同一主题。

随着深度学习方法的普及,在文本生成中seq2seq是一种常用的高质量文本生成框架。VAE的引入可以使得seq2seq的生成过程更具多样性,同时学者们发现将生成条件引入VAE中构成CVAE,可以进一步提高句子的内部主题一致性和句子用词多样性。在近期的工作中,关键词也被证实可以作为中间的生成结果来进一步提高句子与句子之间的主题一致性。

虽然CVAEs等已被证实可以用来进行文本生成,但是它们的生成指向性不足,并且不能很好地保证主题一致性以及生成更加多样化的用词。本文试图通过加入类似写作提纲的关键词得到Key-CVAE,使得模型在生成中文政府公文的过程中不仅可以考虑词和词的主题一致性还能进一步优化句子与句子之间的主题一致性。

实验表明,本文模型Key-CVAE不仅在本文构建的政府公文数据集上在篇章和句中主题一致性上取得了高于预期的效果,并且在一系列对比实验中验证了关键词和CVAE的结合不仅加强了CVAE的主题一致性,还保持了用词多样性的性能,同时验证了训练数据集的多样性对模型生成结果的影响。目前,虽然长文本生成技术在中文任务上只是初期探索阶段,但本文引入的Key-CVAE模型具有很好的参考研究价值,为以后长文本生成任务的研究提供了新的思路。

外文摘要:

Text generation is a challenging task in natural language processing (NLP). It has achieved success in areas such as short-text generation and poetry generation, but as the text becomes longer, problems such as information loss, error propagation, and error drift arise. Research on long-text generation is therefore still at a preliminary stage, especially for Chinese, and government document generation is a special kind of Chinese long-text generation. Government documents are a unique cultural heritage used to convey political tasks, express political views, and record historical events, with their own line of reasoning and choice of words. The challenges of generating them have much in common with long-text generation in general: the output should show wording diversity and thematic consistency. Wording diversity means that a sentence uses varied words rather than repeating the same characters or words. Thematic consistency means that the sentences of a text, and the words within each sentence, address the same theme.
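
Wording diversity in this sense is often quantified with a distinct-n ratio, the share of unique n-grams among all n-grams in a generated text. The abstract does not say which metric the thesis uses, so the sketch below is only one plausible way to measure it. The function name `distinct_n` and the sample token lists are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def distinct_n(tokens: Sequence[str], n: int = 1) -> float:
    """Return unique n-grams divided by total n-grams (0.0 if the text is too short)."""
    ngrams: List[Tuple[str, ...]] = [
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    ]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)


# Hypothetical, already-segmented sample sentences (not from the thesis data set).
repetitive = ["加强", "管理", "加强", "管理", "加强", "管理"]
varied = ["加强", "监督", "完善", "制度", "落实", "责任"]

print(distinct_n(repetitive, 1), distinct_n(varied, 1))
```

A higher ratio means fewer repeated words or phrases. The measure says nothing about thematic consistency, which needs a separate evaluation.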

With the spread of deep learning methods, seq2seq has become a commonly used, high-quality framework for text generation. Introducing a VAE makes the seq2seq generation process more diverse. Scholars have also found that introducing generation conditions into the VAE, which yields a CVAE, further improves thematic consistency within a sentence as well as wording diversity. Recent work has also shown that keywords, used as intermediate generation results, can further improve thematic consistency between sentences.

Although CVAEs have been proven useful for text generation, their generation is not sufficiently directed, and they neither guarantee thematic consistency well nor produce sufficiently diverse wording. This paper proposes the keyword-enhanced conditional variational autoencoder (Key-CVAE), which adds keywords that act like a writing outline. When generating Chinese government documents, the model can then account for thematic consistency between words and further improve thematic consistency between sentences.
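
For readers unfamiliar with CVAEs: such models are typically trained with a reconstruction term plus a KL term that pulls the recognition distribution q(z|x,c) toward the condition-dependent prior p(z|c). The sketch below shows only the closed-form KL term for diagonal Gaussians and the reparameterized sampling step. It is a generic illustration, not the thesis's Key-CVAE code, and the function names are assumptions.

```python
import math
import random
from typing import List, Sequence


def gaussian_kl(mu_q: Sequence[float], logvar_q: Sequence[float],
                mu_p: Sequence[float], logvar_p: Sequence[float]) -> float:
    """KL(q || p) for diagonal Gaussians given means and log-variances."""
    kl = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        kl += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return kl


def reparameterize(mu: Sequence[float], logvar: Sequence[float],
                   rng: random.Random) -> List[float]:
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one value per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


rng = random.Random(0)
z = reparameterize([0.0, 0.0], [0.0, 0.0], rng)  # latent sample fed to the decoder
print(len(z), gaussian_kl([1.0], [0.0], [0.0], [0.0]))
```

In a keyword-conditioned setup, the condition c would include the keywords, so the prior itself depends on the outline. How the thesis encodes that condition is not stated in the abstract.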

Experiments show that Key-CVAE achieves a higher-than-expected effect on document-level and sentence-level thematic consistency on the government document data set constructed in this paper. A series of comparative experiments further verifies that combining keywords with the CVAE strengthens the CVAE's thematic consistency while maintaining its wording diversity, and that the diversity of the training data set affects the generated results. Although long-text generation is still at an early exploratory stage for Chinese tasks, the Key-CVAE model introduced in this paper has reference value and offers a new idea for research on long-text generation.

分类号:

 TP3    

论文总页数:

 62    

参考文献总数:

 50    

参考文献列表:
[1]夏海波.公文写作与处理[m].北京大学出版社.(20110401)
[2]车颖.政府办公室文件规范化管理的路径探索[j].办公室业务.(201810)
[3]王兆胜.政府公文写作特征与提高途径[j].办公室业务.(2017)
[4]蒋锐滢,崔磊,何晶,周明,潘志庚.基于主题模型和统计机器翻译方法的中文格律诗生成[j].计算机学报.(2015)
[5]覃江华.政府公文的语篇特征与汉译英技巧[n].重庆交通大学学报(社科版).(201206)
[6]赵宇晴,向阳.基于分层编码的深度增强学习对话生成.计算机应用.(201710)
[7]boyang ding, quan wang, bin wang, li guo.improving knowledge graph embedding using simple constraints. arxiv preprint arxiv:1805.02408v2.(2018)
[8]martin l j, ammanabrolu p, wang x, et al. event representations for automated story generation with deep neural nets[j]. arxiv preprint arxiv:1706.01331.(2017)
[9]sutskever, i., vinyals, o., & le, q. v.sequence to sequence learning with neural networks. in advances in neural information processing systems (pp. 3104-3112).(2014).
[10]bowman, s. r., vilnis, l., vinyals, o., dai, a. m., jozefowicz, r., & bengio, s.generating sentences from a continuous space. arxiv preprint arxiv:1511.06349.(2015)
[11]doersch, c.tutorial on variational autoencoders. arxiv preprint arxiv:1606.05908.(2016)
[12]ian goodfellow, jean pouget-abadie, mehdi mirza, bing xu, david warde-farley, sherjil ozair, aaron courville, and yoshua bengio.generative adversarial nets. in advances in neural information processing systems. montreal, canada, pages 2672–2680.(2014)
[13]zhao, t., zhao, r., & eskenazi, m.learning discourse-level diversity for neural dialog models using conditional variational autoencoders. arxiv preprint arxiv:1703.10960.(2017)
[14]li, j., song, y., zhang, h., chen, d., shi, s., zhao, d., & yan, r.generating classical chinese poems via conditional variational autoencoder and adversarial training. in proceedings of the 2018 conference on empirical methods in natural language processing (pp. 3890-3900).(2018).
[15]hochreiter, s., & schmidhuber, j.long short-term memory. neural computation, 9(8),1735-1780.(1997)
[16]xu, j., zhang, y., zeng, q., ren, x., cai, x., & sun, x.a skeleton-based model for promoting coherence among sentences in narrative story generation. arxiv preprint arxiv:1808.06945.(2018)
[17]fan, a., lewis, m., & dauphin, y.hierarchical neural story generation. arxiv preprint arxiv:1805.04833.(2018)
[18]wang, z., he, w., wu, h., wu, h., li, w., wang, h., & chen, e.chinese poetry generation with planning based neural network. arxiv preprint arxiv:1610.09889. (2016)
[19]yang, x., lin, x., suo, s., & li, m.generating thematic chinese poetry using conditional variational autoencoders with hybrid decoders. arxiv preprint arxiv:1711.07632.(2017)
[20]kishore papineni, salim roukos, todd ward, and wei-jing zhu .bleu: a method for automatic evaluation of machine translation. in proceedings of the 40th annual meeting of the association for computational linguistics (acl), philadelphia, july 2002(pp. 311-318).(2002)
[21]swanson, r., and gordon, a. s.say anything: using textual case-based reasoning to enable open-domain interactive storytelling. acm transactions on interactive intelligent systems (tiis) 2(3):16.(2012)
[22]tu, z.; liu, y.; shi, s.; and zhang, t.learning to remember translation history with a continuous cache. arxiv preprint arxiv:1711.09367.(2017)
[23]wang, x; chen, w.; wang, y.f.; and wang, w. y.no metrics are perfect: adversarial reward learning for visual storytelling. arxiv preprint arxiv:1804.09160.(2018)
[24]wiseman, s.; shieber, s.; and rush, a.challenges in data-to-document generation. in emnlp, 2253–2263.(2017)
[25]yan, x.; yang, j.; sohn, k.; and lee, h.attribute2image: conditional image generation from visual attributes. in european conference on computer vision, 776–791. springer.(2016)
[26]zhang, j.; feng, y.; wang, d.; wang, y.; abel, a.; zhang, s.; and zhang, a.flexible and creative chinese poetry generation using neural memory. in acl, volume 1, 1364–1373.(2017)
[27]lantao yu, weinan zhang, jun wang, yong yu.seqgan: sequence generative adversarial nets with policy gradient. arxiv preprint arxiv: 1609.05473v6.(2017)
[28]jain, parag,agrawal, priyanka,mishra, abhijit.story generation from sequence of independent short descriptions. arxiv preprint arxiv:1707.05501.
[29]li, j.; luong, m.-t.; and jurafsky, d.a hierarchical neural autoencoder for paragraphs and documents. in acl, volume 1, 1106–1115.(2015)
[30]martin, l. j.; ammanabrolu, p.; hancock, w.; singh, s.; harrison, b.; and riedl, m. o.event representations for automated story generation with deep neural nets. arxiv preprint arxiv:1706.01331.(2017)
[31]may, n. p. m. g. j., and knight, k.towards controllable story generation. naacl workshop.(2018)
[32]jingjing xu, xuancheng ren, junyang lin, and xu sun.diversity-promoting gan: a cross-entropy based generative adversarial network for diversified text generation. in emnlp.(2018)
[33]jingjing xu, xu sun, qi zeng, xuancheng ren, xiaodong zhang, houfeng wang, and wenjie li.unpaired sentiment-to-sentiment translation: a cycled reinforcement learning approach. in acl.(2018)
[34]brent harrison, christopher purdy, and mark o riedl. toward automated story generation with markov chain monte carlo methods and deep neural networks. in proceedings of the 2017 workshop on intelligent narrative technologies.(2017)
[35]boyang li, stephen lee-urban, george johnston, and mark o. riedl. story generation with crowdsourced plot graphs.(2013)
[36]erica greene, tugba bodrumlu, kevin knight. automatic analysis of rhythmic poetry with applications to generation and translation.proceedings of the 2010 conference on empirical methods in natural language processing, emnlp 2010, 9-11 october 2010, mit stata center,massachusetts, usa, a meeting of sigdat, a special interest group of the acl.(2010)
[37]hu z, yang z, liang x, et al. toward controlled generation of text[j].(2018)
[38]iulian vlad serban, alessandro sordoni, ryan lowe, laurent charlin, joelle pineau, aaron courville,yoshua bengio.a hierarchical latent variable encoder-decoder model for generating dialogues.arxiv preprint arxiv:1605.06069v3.(2016)
[39]jiwei li,michel galley,chris brockett,jianfeng gao,bill dolan.a diversity-promoting objective function for neural conversation models. arxiv preprint arxiv: 1510.03055v3.(2016)
[40]juntao li, lidong bing, lisong qiu, dongmin chen, dongyan zhao and rui yan. learning to write creative stories with thematic consistency. in aaai.(2019)
[41] zhao, tiancheng,zhao, ran,eskenazi, maxine.learning discourse-level diversity for neural dialog models using conditional variational autoencoders. 10.18653/v1/p17-1061.(2017)
[42]zhiting hu, zichao yang, xiaodan liang, ruslan salakhutdinov, eric p. xing.toward controlled generation of text. arxiv preprint arxiv:1703.00955v4.(2018)
[43]xiaopeng yang, xiaowen lin, shunda suo, ming li.generating thematic chinese poetry with conditional variational autoencoder. arxiv preprint arxiv:1711.07632v1.(2018)
[44]jianmin bao, dong chen, fang wen, houqiang li, gang hua.cvae-gan: fine-grained image generation through asymmetric training. arxiv preprint arxiv: 1703.10155v2.(2017)
[45]pengfei liu, xipeng qiu, xuanjing huang.recurrent neural network for text classification with multi-task learning. proceedings of the twenty-fifth international joint conference on artificial intelligence (ijcai-16)
[46]zachary c,lipton.a critical review of recurrent neural networks for sequence learning.arxiv preprint arxiv: 1506.00019v1.(2015)
[47]lili yao,nanyun peng,ralph weischedel,kevin knight,dongyan zhao,rui yan.plan-and-write: towards better automatic storytelling .aaai.(2019)
[48]jiyuan zhang, yang feng, dong wang, yang wang, andrew abel, shiyue zhang, andi zhang.flexible and creative chinese poetry generation using neural memory. arxiv preprint arxiv:1705.03773.(2017)
[49]rada, mihalcea, and paul tarau.textrank: bringing order into texts.empirical methods in natural language processing. (2004)
[50]bing liu,philip s. yu.the top ten algorithms in data mining.chapter 6.(1998)
公开日期:

 2019-06-17    

一种英语写作知识点推荐策略.Tianfang Gao

链接

题名:

 一种英语写作知识点推荐策略    

姓名:

 Tianfang Gao    

学号:

 1601210521    

论文语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 李博婷    

导师1单位:

 软件与微电子学院    

导师2姓名:

 俞敬松    

导师2单位:

 软件与微电子学院    

论文答辩日期:

 2019-05-29    

外文题名:

 A Strategy for Recommending English Writing Knowledge Points    

关键词:

 英语写作 知识点推荐 策略制定    

外文关键词:

 English Writing Knowledge Points Recommendation Strategy Making    

论文摘要:

无论在考试还是在日常生活中,英语写作都是中国学生不可回避的难题。目前,市面上虽然存在包括书籍与网站在内的各种英语写作教学资源,但这些资源的作用相对有限。相关书籍只是理论和语料资源的搜集整合;而写作辅助系统为学生的文章提供分数判定和错误批改,并没有结合学生的写作水平和写作目的针对性地在写后阶段帮助学生获得写作能力的提升。

英语写作教学理论繁多,在众多英语写作教学理论中,让学生通过不断改写文章来提升写作能力是广受认可的一种方案。本系统根据此教学理念设计。为了得到适合每个学生的文章改写方向,本系统充分考量每个学生的写作记录,设计并实现了一种英语写作知识点推荐策略。该策略在运行时可以修正自己的反馈以适应对象的变化。对于英语写作教学,该策略的目的是为学生推荐出其最需要学习的内容以提升其写作能力,本文将这些内容定义为学生使用次数较少而母语者使用次数较多的知识点。

对于中国学生在英语写作时遇到的无词可用、难以连词成句、表达不够地道等问题,本文提出了解决方案并对方案进行了验证。使用系统时,学生输入一篇自己的习作,系统对学生该篇习作和学生的写作历史进行单词、搭配、短语和句型四个维度英语知识点使用频率的计算,对比学生文章与范文中各个知识点的使用频率得出推荐的知识点。在单词推荐模块,为学生推荐使用频率小于同主题范文的单词集合。在搭配、语块和句型推荐模块,对于学生没有使用过的知识点,为其推荐同主题范文中使用频率最高的知识点;对于学生使用过的知识点,为其推荐同主题范文中使用频率/学生使用频率最高的知识点。当学生历史文章数量不足时,通过级别判定和主题约束模块来选取替代文章,使用中国英语学习者语料(Chinese Learner English Corpus,CLEC)作为学生历史文章的替代语料。

测试显示,本系统推荐的知识点可以有效提升写作者的写作表现,知识点的有用性得到了被试者的一致认可。在系统测评时,首先邀请五位英语水平较高的人员进行了英语作文写作。让五位受邀者使用系统改写自己的文章,对改写前后的五篇文章分别进行了专家评分和计算机自动评分,并邀请写作者对系统推荐的知识点做了打分。结果显示,在采纳系统推荐的知识点进行文章修改后,人工评分平均提高了6.3%(满分9分制,平均提高0.568分),机器评分平均提高了0.5%(满分100分制,平均提高0.5分)。在对推荐的知识点进行人工评分时,在满分5分制下,单词推荐结果的平均人工评分为3.65,搭配推荐结果的平均人工评分为2.3,语块推荐结果的平均人工评分为3.63,句型推荐结果的平均人工评分为2.8。为了验证系统对英语水平一般的英语写作者的作用,笔者从CLEC语料库中随机抽取5篇文章,并邀请五位学生对这五篇文章进行严格基于系统推荐的知识点的改写,在修改后,人工评分平均提高了7.4%(满分9分制,平均提高0.67分),机器评分平均提高了8.5%(满分100分制,平均提高8.5分)。

外文摘要:

English writing is a major obstacle for Chinese students, both in exams and in daily life. Although various books and websites on English writing exist, most of the books merely collect theory and corpus resources, while writing-assistance websites do nothing more than examine and grade students' articles. These tools lack the individualized writing guidance that is key to improving students' writing ability.

Among the diverse theories of teaching English writing, one of the most widely recognized is improvement through rewriting, and this system is designed and developed on that basis. To find a suitable rewriting direction for each student, the system provides users with knowledge points covering words, collocations, chunks, and sentence patterns. The system adjusts its feedback as the object it serves, the student, changes. The main goal of an intelligent writing-assistance platform is to boost students' writing ability by recommending suitable knowledge points. The knowledge points recommended in this project are those that Chinese students rarely use but native writers use often.

This system aims to tackle common problems Chinese students face, such as being at a loss for words, being unable to connect words into sentences, and writing unidiomatically. This thesis proposes a solution to these problems and verifies the system's effect. To use the system, a student inputs an article. The system then computes the usage frequency of words, collocations, chunks, and sentence patterns in the user's articles, compares it with the frequency in native writers' articles, and decides which ones to recommend. When recommending words, the system picks those the student uses less often than native writers do. For collocations, chunks, and sentence patterns, the strategy distinguishes two scenarios. For knowledge points the user has never used, those with the highest usage frequency in model essays are selected. For knowledge points the user has used before, the system calculates the ratio of usage frequency in model essays to usage frequency in the student's articles and recommends the knowledge point with the largest ratio. When a user does not have enough past articles, the system applies the level determination module and the genre definition module to the Chinese Learner English Corpus (CLEC) to select substitute articles.
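
The two-scenario rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis's implementation: knowledge points are treated as plain strings, frequencies are relative counts, and all function names and sample collocations are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def relative_freq(items: Iterable[str]) -> Dict[str, float]:
    """Map each knowledge point to its share of all occurrences."""
    counts = Counter(items)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


def recommend_words(student: Dict[str, float], model: Dict[str, float]) -> List[str]:
    """Word rule: anything the student uses less often than the model essays do."""
    return sorted(w for w, f in model.items() if student.get(w, 0.0) < f)


def recommend_points(student: Dict[str, float], model: Dict[str, float],
                     top_k: int = 1) -> Dict[str, List[str]]:
    """Collocation/chunk/pattern rule with the two scenarios from the abstract."""
    # Scenario 1: never used by the student, ranked by model-essay frequency.
    unused = {k: f for k, f in model.items() if student.get(k, 0.0) == 0.0}
    # Scenario 2: already used, ranked by model frequency / student frequency.
    used = {k: f / student[k] for k, f in model.items() if student.get(k, 0.0) > 0.0}
    return {
        "unused": sorted(unused, key=lambda k: (-unused[k], k))[:top_k],
        "used": sorted(used, key=lambda k: (-used[k], k))[:top_k],
    }


student_freq = relative_freq(
    ["make progress", "make progress", "play a role", "take part in"])
model_freq = relative_freq(
    ["play a role", "play a role", "make progress", "pose a threat",
     "pose a threat", "pose a threat", "shed light on", "take part in"])
print(recommend_points(student_freq, model_freq))
```

Ties are broken alphabetically here only to keep the output deterministic. The abstract does not say how the thesis handles ties or how many points it recommends per category.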

The system proved effective in improving users' writing performance, and the recommended knowledge points were approved by the users. The evaluation first tracked the human scores (given by two English experts) and machine scores (given by pigaiwang, a website with an embedded grading module) of five articles written by five invited writers. Results show that after using the system, human scores increased by 6.3% on average (0.568 points on a 9-point scale) and machine scores increased by 0.5% on average (0.5 points on a 100-point scale). When asked to rate the recommended knowledge points out of 5, the five writers gave the word recommendation module 3.65, the collocation recommendation module 2.3, the chunk recommendation module 3.63, and the sentence pattern module 2.8. Because the five invited writers generally have a high level of English writing, the author also tested the system's effect on average students: five CLEC articles were randomly selected, and five students were invited to rewrite them using only the knowledge points recommended by the system. After revision, human scores increased by 7.4% (0.67 points on a 9-point scale) and machine scores increased by 8.5% (8.5 points on a 100-point scale).
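
The percentages in this abstract appear to express each score gain as a share of the full scale rather than of the starting score. The short check below reproduces them under that assumption, using the 0.568-point figure given in the Chinese abstract for the first human-score gain. The helper name and row labels are illustrative.

```python
def gain_as_percent_of_scale(points: float, full_marks: float) -> float:
    """Express a score gain as a percentage of the full scale, to one decimal."""
    return round(100.0 * points / full_marks, 1)


# (label, points gained, full marks) as reported in the abstract.
reported = [
    ("invited writers, human score", 0.568, 9),
    ("invited writers, machine score", 0.5, 100),
    ("CLEC rewrites, human score", 0.67, 9),
    ("CLEC rewrites, machine score", 8.5, 100),
]

for label, points, full_marks in reported:
    print(f"{label}: {gain_as_percent_of_scale(points, full_marks)}%")
```

Read this way, the human and machine gains are not directly comparable in points, since the two scales differ by more than a factor of ten.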

分类号:

 H0-0    

论文总页数:

 64    

参考文献总数:

 58    

参考文献列表:
[1] friedman t l. the world is flat: a brief history of the twenty-first century.[j]. international journal, 2007, 9(1):67-69.
[2]《北京日报》 2012 北京外语人口十年增长近一倍,6月17日
[3] center d c. graduate record examination[m]. 1988.
[4] a snapshot of the individuals who took the gre revised general test
[5] vygotsky l s. mind in society: the development of higher psychological processes[m]. harvard university press, 1980.
[6] schmidt r, frota s n. developing basic conversational ability in a second language: a case study of an adult learner of portuguese[c]// r day talking to learn: conversation in second language acquisition rowley. 1980.
[7] ellis r. interpretation tasks for grammar teaching[j]. tesol quarterly, 1995, 29(1): 87-105.
[8] 梁彪. 面向英语智能学习的知识库系统的设计与实现 2018
[9] 赵恩辉. 英语智能写作个性化辅助系统的设计与实现 2018
[10] cotos e. designing an intelligent discourse evaluation tool: theoretical, empirical, and technological considerations[j]. developing and evaluating language learning materials, 2009: 103-127.
[11]徐昉. 英语写作教学与研究[m]. 外语教学与研究出版社, 2012.
[12] kepner, goring c . an experiment in the relationship of types of written feedback to the development of second-language writing skills[j]. the modern language journal, 1991, 75(3):305-313.
[13] wray a. formulaic language and the lexicon[m]. cambridge university press, 2005.
[14] 王立非, 张岩. 基于语料库的大学生英语议论文中的语块使用模式研究[j]. 外语电化教学, 2006(4):36-41.
[15] 郭晓英, 毛红梅. 语块教学对英语写作能力影响的实验研究[j]. 山东外语教学, 2010, 31(3):52-59.
[16] bloom b s. the 2 sigma problem: the search for methods of group instruction as effective as one-to-one tutoring[j]. educational researcher, 1984, 13(6):4-16.
[17] shute v j, psotka j. intelligent tutoring systems: past, present, and future[j]. handbook of research for educational communications and, 2002, 39(12):68.
[18] kingsbury g g, weiss d j. 13. a comparison of irt-based adaptive mastery testing and a sequential mastery testing procedure[m]// new horizons in testing. 1983:257-283.
[19] wainer h. computerized adaptive testing : a primer[m]. l. erlbaum associates, 2000.
[20] millán e, pérez-de-la-cruz j l. a bayesian diagnostic algorithm for student modeling and its evaluation[j]. user modeling and user-adapted interaction, 2002, 12(2-3):281-330.
[21] chung g k w k, o'neil jr h f. methodological approaches to online scoring of essays[j]. 1997.
[22] attali y, burstein j. automated essay scoring with e-rater® v.2.0[j]. journal of technology learning & assessment, 2006, 4(2):i–21.
[23] pawley, a. and syder, f.h. (1983) two puzzles for linguistic theory: native-like selection and native-like fluency. in: richards, j.c. and schmidt, r.w., eds., language and communication, longman, new york: 191-226
[24] attali y. exploring the feedback and revision features of criterion[j]. national council on measurement in education (ncme), educational testing service, princeton, nj, 2004.
[25] cowie a p. multiword lexical units and communicative language teaching[m]// vocabulary and applied linguistics. 1992.
[26] 何克抗. 建构主义的教学模式、教学方法与教学设计[j]. 北京师范大学学报(社会科学版), 1997(5):74-81.
[27] fernandez-delgado m , cernadas e , barro s , et al. do we need hundreds of classifiers to solve real world classification problems?[j]. journal of machine learning research, 2014, 15:3133-3181.
[28] vapnik, v.n. and lerner, a.y., 1963. recognition of patterns with help of generalized portraits. avtomat. i telemekh, 24(6), pp.774-780.
[29] e.t.jaynes. 概率论沉思录[m]. 2009.
[30] 李航.统计学习方法.北京:清华大学出版社,2012
[31] david m. blei, andrew y. ng, and michael i. jordan. latent dirichlet allocation. j. mach. learn. res.,3:993–1022, march 2003.
[32] mikolov t, sutskever i, chen k, et al. distributed representations of words and phrases and their compositionality[c]// international conference on neural information processing systems. curran associates inc. 2013:3111-3119.
[33] 姜柄圭. 面向机器辅助翻译的汉语语块自动抽取研究[j].中文信息学报 2007
[34] nagao m, mori s. a new method of n-gram statistics for large number of n and automatic extraction of words and phrases from large text data of japanese[j]. proc.intern.conf.on computational linguistics, 1994, 1:611-615.
[35] 吕学强. 基于散列技术的快速子串归并算法[j]. 复旦学报(自然科学版), 2004, 43(5):948-951.
[34] 谌贻荣. 中文术语自动提取技术研究 2005.
[36] 罗盛芬, 孙茂松. 基于字串内部结合紧密度的汉语自动抽词实验研究[j]. 中文信息学报, 2003, 17(3).
[37] shannon c e . a mathematical theory of communication[j]. bell labs technical journal, 1948, 27(4):379-423.
[38] 佚名. 超奇迹分类记18000英语单词[m]// 超奇迹 分类记18000英语单词. 2015.
[39] 韦晓亮, 刘剑. 雅思写作论证论据素材大全[m]. 浙江教育出版社, 2012.
[40] 佚名. 雅思词组必备[m]. 2012.
[41] 桂诗春, 杨惠中. 中国学习者英语语料库[m]. 上海外语教育出版社, 2003.
[42] röder m, both a, hinneburg a. exploring the space of topic coherence measures[c]//proceedings of the eighth acm international conference on web search and data mining. acm, 2015: 399-408.
[43] dan klein and christopher d. manning. 2003. accurate unlexicalized parsing. proceedings of the 41st meeting of the association for computational linguistics, pp. 423-430.
[44] petrov s , barrett l , thibaux r , et al. learning accurate, compact, and interpretable tree annotation[c]// international conference on computational linguistics & the meeting of the association for computational linguistics. association for computational linguistics, 2006.
[45] 项炜, 金澎. 大规模语料库上的stanford和berkeley句法分析器性能对比分析[j]. 电脑知识与技术, 2013(8).
[46] fellbaum c, miller g. combining local context and wordnet similarity for word sense identification[c]// 1998.
[47] 吕学强, 张乐, 黄志丹,等. 基于散列技术的快速子串归并算法[j]. 复旦学报(自然科学版), 2004, 43(5):948-951.
[48] quirk, c., c. brockett, and w. b. dolan. 2004. monolingual machine translation for paraphrase generation, in proceedings of the 2004 conference on empirical methods in natural language processing, barcelona spain.
[49] dolan w. b., c. quirk, and c. brockett. 2004. unsupervised construction of large paraphrase corpora: exploiting massively parallel news sources. coling 2004, geneva, switzerland.
[50] jeffrey pennington, richard socher, and christopher d. manning. 2014. glove: global vectors for word representation.
[51] frederiksen j r. a componential theory of reading skills and their interactions[j]. 1982.
[52] william grabe. current developments in second language reading research[j]. tesol quarterly, 1991, 25(3):375-406.
[53] nation p, waring r. vocabulary size, text coverage and word lists[j]. vocabulary: description, acquisition and pedagogy, 1997, 14: 6-19.
[54] laufer b. the lexical profile of second language writing: does it change over time?[j]. relc journal, 1994, 25(2): 21-33.
[55] 崔艳嫣, 王同顺. 接受性词汇量、产出性词汇量与词汇深度知识的发展路径及其相关性研究[j]. 现代外语, 2006, 29(4):392-400.
[56] engber c a. the relationship of lexical proficiency to the quality of esl compositions[j]. journal of second language writing, 1995, 4(2): 139-155.
[57] 秦晓晴. 中国大学生英语写作能力发展规律与特点研究[m]. 中国社会科学出版社, 2007.
[58] 鲍贵. 二语写作中的词汇应用能力研究[m]. 外语教学与研究出版社, 2008.
公开日期:

 2019-06-04    

富信息古籍整理平台的设计与研究.刘晓娟

链接

题名:

 富信息古籍整理平台的设计与研究    

姓名:

 刘晓娟    

学号:

 1601210635    

论文语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 王雷    

导师1单位:

 外国语学院    

导师2姓名:

 高志军    

导师2单位:

 软件与微电子学院    

论文答辩日期:

 2019-05-29    

外文题名:

 Research and Design of an Information-Rich Ancient Books Collation System    

关键词:

 古籍整理 古籍数字化 校勘 富信息    

外文关键词:

 Ancient books collation Ancient books digitalization Emendation Information-rich    

论文摘要:

古籍是辛亥革命以前传抄或刻印的历史典籍等资源的统称,具有较高的文物价值和文化意义。但因年代久远,善本难存。为了恢复古籍的原本样貌,古籍工作者需要进行辑佚、校勘、注释、标点等整理工作,以探求古籍原本样貌,便于后人阅读研究。

为了实现古籍资源共享,数字化是必由之路。如何借助先进的信息技术,提升古籍整理效率,解决数字化过程中存在的问题是当务之急。通过分析古籍整理的研究现状,可将问题总结为如下四点:一、古籍整理缺少功能完整、流程完善的开放平台;二、缺乏统一规范的整理流程,整理工作欠缺指导;三、仅重视整理的结果,丢失整理的过程信息;四、专业性较强的整理工作缺少专家参与,整理质量参差不齐。

为解决上述问题,以提供便捷、完整、高效的古籍整理系统为目标,结合古籍整理的特点和原则,笔者创新性地提出了重视古籍整理过程的思路,并完成了多层次、可追溯的古籍整理平台产品设计,为整理者提供了高效的工作环境。平台的优势可总结如下:一、包含完整的工作流程,平台将版本选择、文本录入、内容整理等重点工作囊括在内。在系统的指引下,整理者可通过一个平台完成整理任务,减少不同工具之间的切换。二、重视整理过程,将系统分成不同的工作层次,借助富信息的设计,保存每层的校改信息,根据存储的数据追溯整理过程,出现问题便于定位,及时更改,也为研究提供了支持。三、区分专家和普通整理者角色,分别匹配不同的整理任务,确保参与者能胜任整理工作,产出符合要求的整理成果;同时,系统分为多个层次,可在每个层次审查整理结果,保证整理质量。

在北京大学儒藏古籍整理专家的指导下,笔者通过整理工作的典型场景应用范例,对本研究设计的整理平台进行了验证。工作成果得到了古籍专家的肯定,证明了富信息整理平台可以为古籍整理工作提供便利,提高整理结果的可信度。

 

外文摘要:

Ancient books are the historical books hand-copied or printed before the 1911 Revolution, and they are carriers of Chinese culture. Hundreds of years have passed since these books first appeared, so they have inevitably suffered loss and damage. To restore their original appearance for reading and research, specialists need to collate them and add punctuation, notes, and so on.

On the purpose of widely sharing resources of ancient books, digitalization is the only way. In the past, scholars continually did research work with outdated tools. But nowadays, information technology brings more possibility to collation work. The collation work for historical books is developing through time. Though computer-aided collation systems help improve the efficiency of the work, there are still many problems waiting to be solved in collation practice as following listed:

Firstly, lack of open and integrated platform for collation work. Secondly, deficient in standard workflow and guidance. Thirdly, loss high value information due to the emphasis on the work result rather than the work process, workflow can not be traced back. Finally yet importantly, quality issues on collation work. 

To solve the problems mentioned above, the author proposed a novel idea of valuing collation process. Under the guidance of this idea, the paper designed an ancient books collation system with the following advantages:

Firstly, the system guides users to finish the whole process of collation work. Secondly, the system is designed by the guidance of emphasis on collation process, using XML file to record the data of collation process, which makes tracing the workflow back possible. Thirdly, the design of both user-work differentiated and multiple verification based on multi-layer design offers a guarantee of quality.

This paper has designed a specific collation scenario to validate the design of Information-Rich Ancient Books Collation System. Experts from the Ru Cang Compiling and Editing Center of Peking University had given their recognition. The interviews proved that this study and design could effectively lighten the burdens of collation work and improve the working efficiency to this field. 
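
The abstract states that collation process data is recorded in XML so that the workflow can be traced back, but it does not give the schema. The following is only a minimal sketch of that idea, not the thesis's format: the element and attribute names (`collation`, `layer`, `revision`, `pos`, `editor`) and the layer names are hypothetical, chosen for illustration.

```python
import xml.etree.ElementTree as ET


def record_revision(root, layer, position, before, after, editor):
    """Append one revision to the named working layer, creating the layer if needed."""
    layer_el = root.find(f"./layer[@name='{layer}']")
    if layer_el is None:
        layer_el = ET.SubElement(root, "layer", {"name": layer})
    rev = ET.SubElement(layer_el, "revision", {"pos": str(position), "editor": editor})
    ET.SubElement(rev, "before").text = before
    ET.SubElement(rev, "after").text = after
    return rev


def trace(root, position):
    """Return (layer, before, after, editor) for every revision at a position, in recorded order."""
    history = []
    for layer_el in root.findall("layer"):
        for rev in layer_el.findall("revision"):
            if rev.get("pos") == str(position):
                history.append(
                    (layer_el.get("name"), rev.findtext("before"),
                     rev.findtext("after"), rev.get("editor"))
                )
    return history


# Hypothetical example: the same character position is revised in two layers.
doc = ET.Element("collation", {"work": "example"})
record_revision(doc, "transcription", 12, "己", "已", "collator_a")
record_revision(doc, "emendation", 12, "已", "巳", "expert_b")
history = trace(doc, 12)
```

Keeping each layer's revisions separate, rather than overwriting the text, is what lets a reviewer see which layer and which editor introduced a given reading.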

Classification number:

 TP3

Total pages:

 118

Number of references:

 76

References:
[1] 周娟. 故宫文创对中国元素的运用研究[j]. 视听, 2018, no.130(02):198-199.
[2] 曹亦冰. 从中国大陆当代古籍整理的现状看其类别、方式方法及走向[j]. 古籍整理研究学刊, 2005(1):1-7.
[3] 杨牧之. 新中国古籍整理出版工作回顾与展望[j]. 中国出版史研究, 2018(1).
[4] 曹之. 中国古籍版本学. 第 3 版. 武汉: 武汉大学出版社. 2015.
[5] 毛建军. 古籍数字化的概念与内涵[j]. 图书馆理论与实践, 2007(4).
[6] 刘琳, 吴洪泽. 古籍整理学. 成都: 四川大学出版社. 2003.
[7] 李明杰, 俞优优. 中文古籍数字化的主体构成及协作机制初探[j]. 图书与情报, 2010(1):34-44.
[8] 曹玲. 农业古籍数字化建设实践. 芜湖: 安徽师范大学出版社. 2017.
[9] 许逸民. 古籍整理釋例. 北京: 中华书局. 2011.
[10] 肖珑, 苏品红, 刘大军. 国家图书馆古籍元数据规范与著录规则. 北京: 国家图书馆出版社. 2014.
[11] 黄永年. 古籍整理概论. 西安: 陕西人民出版社. 1985.
[12] 时永乐. 古籍整理教程. 保定: 河北大学出版社. 1997.
[13] 曹林娣. 古籍整理概论. 北京: 北京大学出版社, 2007.
[14] 刘琳, 吴洪泽. 古籍整理学. 成都: 四川大学出版社. 2003.
[15] 陈力. 中文古籍数字化方法之检讨[j]. 国家图书馆学刊, 2005, 14(3).
[16] 董洪利. 古典文献学基础. 北京: 北京大学出版社. 2008.
[17] 管锡华. 校勘學教程. 北京: 北京大学出版社. 2013.
[18] 胡适. 校勘學方法論: 序陳垣先生的元典章校補釋例. 北京: 出版者不详. 1934.
[19] 陈垣. 校勘学释例. 北京: 中华书局. 1959.
[20] 刘伟红. 中文古籍数字化的现状与意义[j]. 图书与情报, 2009(4):134-137.
[21] 彭江岸. 论古籍的数字化[j]. 河南图书馆学刊, 2000, 20(2):63-65.
[22] 王桂平. 我国古籍数字化的现状及展望[j]. 图书情报知识, 2000(4):50-1.
[23] 李运富. 谈古籍电子版的保真原则和整理原则[j]. 古籍整理研究学刊, 2000(1):1-7.
[24] 张尚英. 古籍电子化问题探析[j]. 安徽师范大学学报(人文社科版), 2002, 30(2):244-248.
[25] 李明杰. 中文古籍数字化基本理论问题刍议[j]. 图书馆论坛, 2005, 25(5):97-100.
[26] 李国新. 中国古籍资源数字化的进展与任务[j]. 大学图书馆学报, 2002, 20(1).
[27] 陈力. 中国古籍数字化的现状与展望[j]. 古籍整理出版情况简报, 2004, 4.
[28] 孙慧云. 基于专利分析的古籍数字化技术演进研究[j]. 山东图书馆学刊, 2018.
[29] 崔雷. 中文古籍数字化研究[d]. 吉林大学. 2010.
[30] 王立清. 中文古籍数字化研究. 北京: 国家图书馆出版社. 2011.
[31] 曹霞, 常存库, 裴丽. 中医古籍数字化建设及其平台设计和实现[j]. 中华医学图书情报杂志, 2016, 25(3):45-47.
[32] Long X, Ling C. Designing and implementation of Chinese metadata standards: a case study on metadata applications in Peking University Rare Book Digital Library[c]//Global Digital Library Development in the New Millennium: Fertile Ground for Distributed Cross-Disciplinary Collaboration: Proceedings of the 12th International Conference on New Information Technology. Beijing: Tsinghua University Library, May 29-31, 2001.
[33] 孟忻. “中华字库”工程——中华民族有史以来规模最大的汉字及少数民族文字整理工作[j]. 中国索引, 2013(1):43-44.
[34] 张翼飞. 古籍数字化中的字符集问题与解决方案[j]. 出版发行研究, 2016(3):77-80.
[35] 张轴材, 朱岩. 大规模文献数字化的实践与数字图书馆建设[j]. 高校文献信息研究, 2001(1):6-10.
[36] 馬德偉. TEI 使用指南--運用 TEI 處理中文文獻. 台北: 出版者不详. 2009.
[37] 王魁生. 计算机支持协同设计系统的研究与实现[d]. 西安交通大学. 2001.
[38] De Lima Y O, De Souza J M. The future of work: insights for CSCW[c]// 2017 IEEE 21st International Conference on Computer Supported Cooperative Work in Design (CSCWD), Wellington, New Zealand, 2017.4.26-2017.4.28. IEEE, 2017:42-47.
[39] Papangelis K, Potena D, Smari W W, et al. Advanced technologies and systems for collaboration and computer supported cooperative work. Future Generation Computer Systems. 2019;95:764-74.
[40] Kittur A, Nickerson J V, Bernstein M, et al. The future of crowd work[c]//Proceedings of the 2013 Conference on Computer Supported Cooperative Work. ACM, 2013: 1301-1318.
[41] Hoßfeld T, Hirth M, Tran-Gia P. Modeling of crowdsourcing platforms and granularity of work organization in future internet[c]// Teletraffic Congress. IEEE, 2011.
[42] choroś k, jarosz j. most frequent errors in digitization of polish ancient manus[c]//asian conference on intelligent information and database systems. springer, cham, 2018: 170-179.
[43] 辛睿龙, 王雅坤. 古籍数字化中汉字处理的现状、问题及策略[j]. 图书馆理论与实践, 2017(9).
[44] 李书宁, 曾姗. 国外图书馆数字馆藏众包建设实践调查与分析[j]. 图书情报工作, 2014, 58(23):83-90.
[45] 赵阳, 顾磊. 基于中文信息处理的古籍整理研究评述[j]. 图书情报工作, 2010, 54(3).
[46] Robertson B, Boschetti F. Large-scale optical character recognition of ancient Greek[j]. Mouseion, 2017, 14(3): 341-359.
[47] Grana C, Serra G, Manfredi M, et al. Layout analysis and content enrichment of digitized books[j]. Multimedia Tools and Applications, 2016, 75(7): 3879-3900.
[48] Mehri M, Gomez-Kramer P, Heroux P, et al. A texture-based pixel labeling approach for historical books[j]. 2017, 20(2): 325-64.
[49] Mehri M, Héroux P, Lerouge J, et al. Page retrieval system in digitized historical books based on error-tolerant subgraph matching[c]//2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2017, 1: 1168-1173.
[50] 常娥, 侯汉清, 曹玲. 古籍自动校勘的研究和实现[j]. 中文信息学报, 2007, 21(2):83-88.
[51] 黄建年. 古籍计算机自动断句标点与自动分词标引研究. 安徽师范大学出版社, 2013.
[52] 张开旭, 夏云庆, 宇航. 基于条件随机场的古汉语自动断句与标点方法[j]. 清华大学学报(自然科学版), 2009, 49(10): 1733-6.
[53] 赵宇波. 关于图书馆古籍整理人才培养问题的思考[j]. 科技视界, 2017(22).
[54] 王国强. 图书馆古籍整理人才培养问题的思考[j]. 山东图书馆学刊, 2011(5):11-13.
[55] 高娟, 刘家真. 中国大陆地区古籍数字化问题及对策[j]. 中国图书馆学报, 2013, 39(4):110-119.
[56] 厉莉. 古籍数字化的现状及对策[j]. 江西图书馆学刊, 2002, 32(1).
[57] 范佳. “数字人文”内涵与古籍数字化的深度开发[j]. 图书馆学研究, 2013(3):29-32.
[58] 于亭. 计算机与古籍整理研究手段现代化[j]. 古汉语研究, 2000, 3: 66-70.
[59] 朱小健. 古籍整理通用系统及其中字典的编纂[j]. 语言文字应用, 2000(3):99-103.
[60] 杜正民. 佛學數位資源的建置與開展[J]. 法鼓佛學學報, 2012(10):147-210
[61] 胡佳佳. 古籍数字化中基于关系的 xml 数据库[j]. 农业图书情报学刊, 2010, 22(2).
[62] 马创新, 陈小荷. 基于本体和 xml 的注疏文献的结构化知识表示[j]. 图书馆杂志, 2017, 36(8):62-68.
[63] 邵正坤. 古籍数字化的困局及应对策略[j]. 图书馆学研究, 2014(12):32-34.
[64] 何忠礼. 略论历史上的避讳[j]. 浙江大学学报(人文社会科学版), 2002, 32(1):82.
[65] 李致忠. 古书版本鉴定. 北京: 文物出版社, 1997.
[66] 魏隐儒 , 王金雨. 古籍版本鉴定丛谈. 北京: 中国社会科学出版社, 2017.
[67] 蓝永.论古籍整理的新方式——古籍数字化[d].山东:山东大学,2007.
[68] 陈明, 丁晓青, 梁健. 复杂中文报纸的版面分析、理解和重构[j]. 清华大学学报(自然科学版), 2001, 41(1).
[69] 靳从. 中文版面分析关键技术的研究[d]. 南京理工大学, 2007.
[70] 丁晓青, 王言伟. 文字识别: 原理、方法和实践 Principles, Methods and Practice. 北京: 清华大学出版社. 2017.
[71] 姜哲, 马少平, 夏莹. 大型中文古籍《四库全书》自动版面分析系统[j]. 中文信息学报, 2000, 14(2):14-20.
[72] 南京图书馆. 中國古籍善本書目索引. 上海: 上海古籍出版社. 2009.
[73] 李明杰. 数字环境下古籍整理范式的传承与拓新[j]. 中国图书馆学报, 2015, 41(5):99-110.
[74] 李元祥, 丁晓青, 刘长松. 一种基于噪声信道模型的汉字识别后处理新方法[j]. 清华大学学报: 自然科学版, 2001(1):24-28.
[75] 陳曦. 中國文學領域古籍整理工作之研究[d]. 國立中興大學圖書資訊學研究所. 2017. https://hdl.handle.net/11296/h89724
[76] 宋子然, 刘兴均. 中国古书校读法. 第 3 版. 成都: 巴蜀书社. 2004.
Publication date:

 2019-06-15

公文辅助阅读平台的设计与实现.何寒松

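Link

Title:

 公文辅助阅读平台的设计与实现

Name:

 何寒松

Student ID:

 1601210540

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Public release:

 Public

Degree level:

 Master's

Degree:

 Master of Engineering (professional degree)

Institution:

 Peking University

School:

 School of Software and Microelectronics

Advisor 1 name:

 王雷

Advisor 1 affiliation:

 School of Foreign Languages

Defense date:

 2019-05-29

Keywords:

 Official documents; natural language processing; reading assistance

Abstract: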

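With the development of the Internet, official documents (公文) are no longer used only to convey information within government departments; they have begun to enter people's daily lives. More and more people read and study official documents through the “学习强国” app, the politics sections of news apps, and the official websites of local governments. Efficient reading of official documents not only lets civil servants relay official business up and down the hierarchy more quickly and effectively, but also lets the public quickly learn and grasp the information the documents convey. The language of official documents, however, is not the spoken language of everyday life but the written language of government office work. Its characteristic accuracy, concision, plainness, and standardization make the wording and phrasing of official documents very rigorous, which in turn places demands on readers' educational level and reading comprehension. To make official documents easier for most people to understand, an effective reading platform is needed to assist readers in reading and understanding them.
The author's survey found two problems in reading official documents that have not been effectively solved. First, readers have difficulty extracting key information from long sentences. The sentences most commonly used in everyday life are about 2 to 15 characters long, whereas sentences in official documents average 40 to 50 Chinese characters. This is mainly because they contain many chunks (语块), such as “社会主义核心价值观”, “两学一做”, and “第十九次全国代表大会”, and because official documents contain compound sentences with too many clauses. Second, the current presentation of official documents does not highlight their key information. Both electronic and printed publications present the raw text, which fails to reflect the distinctive features of official-document language and does not help readers read and understand the documents quickly.
To solve these problems, the author first crawled more than 40 million characters of official-document text from Peking University's 规划智库 website as the object of the text processing in this thesis, and then analyzed the textual features of official documents using natural language processing techniques. By extracting the key information of sentences and improving the presentation of documents, the author designed and implemented a reading-assistance platform for official documents that helps readers in two ways.
On the one hand, to extract the key information of a sentence, the chunks and the main constituents (主干成分) of the sentence must be identified. Because sentences in official documents are very long, if words are used as the unit of a sentence, readers have to take in each word one by one, assemble the words into chunks, and then assemble the chunks into the sentence, which increases the load on short-term working memory. This thesis therefore uses chunks as the unit of a sentence. Starting from multi-character words, organization names, four-character phrases (四字格), document names, and new concepts, it uses rule-based and statistical methods to extract these types of chunks from sentences. Recognition accuracy reaches 92.91% for multi-character words, 82.88% for organization names, and 87% for meeting names within organization names. Official documents contain many compound sentences, which often include multiple clauses, so the thesis first splits compound sentences into clauses. It also uses a maximum entropy model to build a recognizer for the main constituents of sentences; on the sample documents, accuracy for each of the three main constituents (subject, predicate, and object) is above 90%.
On the other hand, the thesis uses the results of this analysis to study improved presentation of long sentences, to help readers understand complex long sentences more quickly and effectively. It first builds the reading-assistance platform with Flask and displays the identified chunks with markup; it then applies different spacing between paragraphs, between sentences, and between clauses to improve the presentation; finally, it extracts section headings and designs a sidebar navigation so that readers can quickly grasp the thematic outline of the whole document and retrieve information within it more efficiently.
At the end of the study, the author invited 10 students to complete reading-comprehension tasks on official documents designed for this thesis and compared their test scores before and after using the platform. The results indicate that the platform does provide some help in reading official documents.
The reading-assistance platform studied and implemented in this thesis has practical and research value. For readers, the platform helps improve reading efficiency; for those who will work on optimizing the reading of official documents, automatically marking chunks and main constituents reduces the manual annotation workload. The technical exploration of the official-document register in this thesis can serve as a reference for researchers who apply natural language processing to official documents, and for research on reading assistance for texts in other domains.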
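
The abstract says compound sentences are first split into clauses and that paragraphs, sentences, and clauses are then displayed with different spacing, but it does not state the splitting rule. The sketch below assumes the simplest possible rule, splitting on punctuation; the function names and the choice of delimiter sets are illustrative assumptions, not the thesis's method.

```python
import re

SENTENCE_END = "。！？"   # assumed sentence-final punctuation
CLAUSE_END = "，；："     # assumed clause-level punctuation


def split_keep(text, delimiters):
    """Split text after any delimiter character, keeping the delimiter with its segment."""
    cls = re.escape(delimiters)
    parts = re.findall(rf"[^{cls}]+[{cls}]?", text)
    return [p.strip() for p in parts if p.strip()]


def split_document(text):
    """Return a list of sentences, each given as a list of its clauses."""
    return [split_keep(sentence, CLAUSE_END) for sentence in split_keep(text, SENTENCE_END)]


sample = "各级政府要加强领导，落实责任；有关部门要密切配合。各地要抓紧制定实施方案。"
clauses = split_document(sample)
```

With the text in this nested form, a front end can apply one spacing value between the inner items (clauses) and a larger one between the outer items (sentences).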

 

Classification number:

 TP3

Total pages:

 84

Number of references:

 40

References:
狄颖. 中文多词表达抽取研究[D].南京师范大学,2013.
何先友,莫雷.国外文章标记效应研究综述[J].心理学动态,2000(03):36-42.
何先友,莫雷.文章主题的组织方式对文章标记效应的影响[J].心理发展与教育,2000(03):25-29.
胡玉溪. 基于双语语料的汉语多词表达抽取[D].北京邮电大学,2011.
黄自然. 以“字”为单位的汉语平均句长与句长分布研究[J].齐齐哈尔大学学报(哲学社会科学版),2018(01):133-138.
贾光茂,杜英.汉语“语块”的结构与功能研究[J].暨南大学华文学院学报,2008(02):64-70.
金勇.公文四字格短语运用浅析[J].中国西部科技,2008(18):87-88.
李航.统计学习方法[M].北京:清华大学出版社,2012:80-81.
李音, 戴卫平.语块理论与语块教学[J].现代语文:下旬.语言研究,2012,(12):22-24.
刘森,刘莎,王中雨.从理论角度分析影响阅读的因素[J].边疆经济与文化,2011(03):138-139.
刘婷,詹宏伟.陌生语块突显对其附带习得及文本理解的影响[J].湖北第二师范学院学报,2011,28(03):29-33.
刘子群. 书籍版式设计中字体排版的应用研究[D].江南大学,2007.
陆丙甫,蔡振光.“组块”与语言结构难度[J].世界汉语教学,2009,23(01):3-16.
吕英. 文章结构标记、呈现方式对学生认知负荷的影响[D].河南大学,2008.
吕子静.浅谈公文语言的特点与要求[J].陕西青年管理干部学院学报,2006(03):38-40
毛奇,连乐新,周文翠,袁春风.基于标点符号分割的汉语句法分析算法[J].中文信息学报,2007(02):29-34.
缪苗. VNC结构多词表达的抽取与分类[D].北京邮电大学,2011.
亓文香.语块理论在对外汉语教学中的应用[J].语言教学与研究,2008(04):54-61.
孙鹏程.基于语料库的现代行政公文句式考察[J].皖西学院学报,2018,34(01):130-134.
孙启高. 基于语料库的公文缩略语知识挖掘研究[D].山东大学,2014.
谭文堂. 基于统计模型的汉语句子主干分析[D].国防科学技术大学,2008.
汪春红. 汉语并列关系复句中的决策式依存句法分析与研究[D].华中师范大学,2016.
王蕾. 基于统计方法的汉语长句依存句法分析[D].中国海洋大学,2009.
徐润华,陈小荷,李斌.分词语料库中的并列式四字格识别[J].计算机工程与应用,2010,46(04):139-141.
徐润华,曲维光,陈小荷,王东波.多语料库中汉语四字格的切分和识别研究[J].中文信息学报,2013,27(05):15-21+42.
阎国利,白学军.中文阅读过程的眼动研究[J].心理学动态,2000(03):19-22.
杨振鹏. 中文多词表达抽取及其在依存句法分析中的应用[D].南京师范大学,2015.
叶起昌.超链接的导航和语义功能——解读超链接文本不可缺失的环节[J].北京交通大学学报(社会科学版),2007(03):103-107.
张硕. 基于语料库的2012年度党政机关公文词频分析[D].暨南大学,2013.
张小衡,王玲玲.中文机构名称的识别与分析[J].中文信息学报,1997(04):22-33.
张艳丽. 中文机构名称的自动识别[D].大连理工大学,2003.
赵国俊.电子政务教程[M].北京:中国人民大学出版社,2004: 62.
郑雪莹. 不同文本突显方式对于泛读中词汇附带习得的影响[D].上海师范大学,2015.
宗成庆.统计自然语言处理[M].北京:清华大学出版社,2013:23-26,122,125.
邹小阳.公文语体中的新四字格探析[J].湖南科技学院学报,2008(05):202-204.
Finkel JR, Grenager T, and Manning CD. 2005. Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling, In: Proceedings of ACL, pages 363-370.
Leaman R and Gonzalez G. 2008. BANNER: An Executable Survey of Advances in Biomedical Named Entity Recognition. Pacific Symposium on Biocomputing, 13:652-663.
Marcus N, Cooper M, Sweller J. Understanding Instructions. Journal of Educational Psychology, 1996, 88(1):49-63.
Tinker M A. Recent studies of eye movements in reading. Psychological Bulletin, 1958, 58(4):215-231.
Wray A. Formulaic Language and the Lexicon[M]. Cambridge: Cambridge University Press, 2002.
Publication date:

 2019-06-04

多功能古籍协同研究平台的研究与设计.邓娟

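Link

Title:

 多功能古籍协同研究平台的研究与设计

Name:

 邓娟

Student ID:

 1601210497

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Public release:

 Public

Degree level:

 Master's

Degree:

 Master of Engineering (professional degree)

Institution:

 Peking University

School:

 School of Software and Microelectronics

Advisor 1 name:

 王雷

Advisor 1 affiliation:

 School of Foreign Languages

Defense date:

 2019-05-29

Keywords:

 Ancient book research; ancient book collation; collaborative research

Abstract: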

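Ancient texts are indispensable research materials for specialists in ancient books and an important vehicle of cultural transmission. Traditional research on ancient books relies on paper resources and is heavily constrained by time and space, leading to limited access to resources and low research efficiency. Existing digital platforms for ancient books, meanwhile, fall short in functionality and in the way content is presented, and struggle to be of practical use. Improving research efficiency and spreading traditional culture requires an effective platform for research and reading.

The author's survey found four problems that have not yet been effectively solved. First, readers face barriers in understanding characters and words, sentences and passages, and general knowledge of the ancient world when reading ancient texts, and existing platforms do not address this when providing resources, which hinders the spread of traditional culture. Second, existing digital platforms do not fully mine the content of documents and lack relationships between documents, such as citation relationships and relationships between different editions of a document. Furthermore, proper nouns in ancient texts, such as personal and place names, are not linked with related materials into a knowledge network; for example, a person is not linked to their aliases, official posts, works, and biographical materials. Third, communication and cooperation among researchers are not well supported, so research ideas and results are hard to share. Fourth, research support tools are lacking, such as discovery and comparison of similar sentences and passages.

To solve these problems, this study draws on the theory and methods of ancient book collation and research, takes into account the characteristics of ancient texts, exploits the advantages of digital platforms, and designs a multifunctional collaborative research platform for ancient books that improves the platform's supporting role in four respects. First, it provides character dictionaries, word dictionaries, and reference materials, builds knowledge bases of biographical and geographical materials to assist reading, and provides annotations, translations, and other content to remove reading barriers. Second, based on the characteristics of ancient texts, it provides data processing tools such as proper-name tagging and mining of cited books and, with computer assistance, links the knowledge in documents into a knowledge network. Third, it uses the “research project” as the form of resource organization to link research topics, research documents, and researchers, and then supports online communication and collaborative research through online annotation and editing. Fourth, it integrates auxiliary tools that support research and improve efficiency, such as similar passages, text comparison, citations of classics, and citation mining.

Finally, the study validated the effectiveness and soundness of the design by inviting ancient book experts to try the platform on real research examples. The results indicate that the collaborative research environment and the tools such as document tagging provided in this thesis effectively assist researchers in their work and improve research efficiency.

Abstract (foreign language):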

Ancient books are essential research materials for experts and important carriers of cultural transmission. Traditional research resources are mainly paper materials, which are limited by time and space, so experts are confronted with limited research materials and low research efficiency. To solve these problems, digital platforms have appeared and play significant roles in ancient books research, but these systems have many shortcomings in functionality and content presentation. In this context, this paper aims to design a platform that provides experts in ancient books research with a collaborative online research environment and ordinary readers with a user-friendly reading environment. By studying existing platforms and consulting experts, the author finds that the following four problems have not been effectively resolved. First, language barriers and a lack of general knowledge about the ancient world are major obstacles for modern Chinese readers of ancient texts. Current platforms have not solved these problems when providing resources and functions, so they cannot popularize traditional culture effectively. Second, the existing platforms fail to fully exploit the content of the literature or to link cited documents and different versions of the same work. Furthermore, these platforms are not able to form a knowledge network that connects proper names in the ancient books, such as the names of people and places, with related content. Third, for lack of a collaborative online research environment, researchers are unable to share research ideas and results with other people. Fourth, there are no research support tools such as similar-segment discovery and citation marking.
To solve the above problems, this study draws on the theory and methods of ancient books research, considers the characteristics of ancient works, takes advantage of digitalization, and designs a multifunctional platform for collaborative research on ancient books, which plays an auxiliary role in the following four respects. First, the platform establishes a knowledge base of biographical and geographic materials to aid reading and provides annotations, translations, and other content to break down reading barriers. Second, considering the characteristics of ancient books, it provides data processing tools such as name tagging and quotation mining, and links related content to establish a knowledge network. Third, it organizes research topics, literature, and researchers by “research project” and then establishes a communication and collaboration platform with online annotation and editing functions. Fourth, the online system integrates similar-segment discovery, text comparison, automatic annotation, and other tools to assist research work and improve research efficiency. Finally, this paper verifies the validity and feasibility of the design through user testing and a case study. According to feedback from experts in ancient books research, this design complements the limited functions of current platforms, helps experts carry out research work, and promotes ancient books to the public.
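
The abstract names similar-segment discovery and text comparison among the integrated tools but does not describe how they work. As a rough illustration only, the sketch below uses character-level matching from Python's standard `difflib`; the function names, the 0.6 threshold, and the choice of `SequenceMatcher` are assumptions for this example, not the platform's actual method.

```python
from difflib import SequenceMatcher


def similar_passages(query, passages, threshold=0.6):
    """Return (ratio, passage) pairs whose character-level similarity to query
    is at least threshold, most similar first."""
    scored = [(SequenceMatcher(None, query, p).ratio(), p) for p in passages]
    kept = [item for item in scored if item[0] >= threshold]
    return sorted(kept, key=lambda item: item[0], reverse=True)


def diff_spans(a, b):
    """List the (operation, text_in_a, text_in_b) edits that turn a into b,
    skipping spans that are identical in both."""
    ops = SequenceMatcher(None, a, b).get_opcodes()
    return [(tag, a[i1:i2], b[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]


# Two readings of the same line that differ by one variant character,
# plus an unrelated line that should fall below the threshold.
reading_a = "學而時習之不亦說乎"
reading_b = "學而時習之不亦悅乎"
other = "有朋自遠方來不亦樂乎"

matches = similar_passages(reading_a, [reading_b, other])
variants = diff_spans(reading_a, reading_b)
```

Here `variants` isolates the single differing character, which is the kind of output an edition-comparison view needs; a production system would likely need indexing rather than pairwise comparison to scale to whole corpora.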
 
 

Classification number:

 TP3

Total pages:

 83

Number of references:

 86

References:
[1] 童志斌. 关于普通高中语文教材中文言课文编排与教学的思考——7种版本高中语文新课标必修教材比较[J]. 教育理论与实践, 2009(17):17-19.
[2] 王志凯. 中学文言文教学目的研究[硕士学位论文]. 浙江: 浙江师范大学, 2002.
[3] 朱易安. 古籍整理研究面临的困难和出路之我见[J]. 古籍整理研究学刊, 1999(6):3-4.
[4] 袁林. 中国古代史研究数字化文献资源与利用[J]. 中国史研究动态, 2000(12):19-27.
[5] 王纯. 古籍数字化之趋势[J]. 济宁学院学报, 2000(3):50-51.
[6] 李运富. 谈古籍电子版的保真原则和整理原则[J]. 古籍整理研究学刊, 2000(1):1-7.
[7] 彭江岸. 论古籍的数字化[J]. 河南图书馆学刊, 2000, 20(2):63-65.
[8] 乔红霞. 关于古籍全文数据库建设工作的思考[J]. 河南图书馆学刊, 2001, 21(4):58-60.
[9] 毛建军. 古籍数字化的概念与内涵[J]. 图书馆理论与实践, 2007(4).
[10] 毛建军. 古籍数字化理论与实践[M]. 北京: 航空工业出版社, 2009.
[11] 高娟, 刘家真. 中国大陆地区古籍数字化问题及对策[J]. 中国图书馆学报, 2013, 39(4):110-119.
[12] 刘家真. 馆藏文献数字化的原则与方法(下)[J]. 中国图书馆学报, 2001, 27(6).
[13] 牛红广. 关于古籍数字化性质及开发的思考[J]. 图书馆, 2014(2).
[14] 王立清. 关于多元古籍数字化主体的探讨[J]. 图书馆学研究, 2011(7):53-58.
[15] 吴夏平. 古籍数字化与学术研究[J]. 贵州师范学院学报, 2007, 23(6):69-72.
[16] 陈诚.论古典文献数字化[硕士学位论文]. 苏州: 苏州大学, 2004.
[17] 胡石, 肖莉杰. 新媒体环境下的古籍阅读模式研究[J]. 图书馆学研究, 2012(19):78-81.
[18] 曹林娣. 古籍整理概论[M]. 北京: 北京大学出版社, 2007.
[19] 王玉良. 略谈我国古代文字的载体及书籍的起源[J]. 中国图书馆学报, 1993(2):78-82+98.
[20] 黄永年. 古籍整理概论[M]. 上海: 上海书店出版社, 2013.
[21] 许威汉. 训诂学读本[M]. 上海交通大学出版社, 2010.
[22] 徐春波.中医古籍文献的多层次结构探析[C].//医论集锦.山东中医药大学,2005:86-91.
[23] 马创新, 陈小荷. 基于本体和XML的注疏文献的结构化知识表示[J]. 图书馆杂志, 2017, 36(8):62-68.
[24] 刘琳, 吴洪泽. 古籍整理学[M]. 四川: 四川大学出版社, 2003.
[25] 李明杰. 数字环境下古籍整理范式的传承与拓新[J]. 中国图书馆学报, 2015, 41(5):99-110.
[26] 汪耀楠. 注释学纲要[M]. 北京: 语文出版社, 1991.
[27] 时永乐. 古籍整理教程[M]. 河北: 河北大学出版社, 2003.
[29] 潘树广. 论古代文学研究中的文献学方法[J]. 常熟高专学报, 1999(1):58-62+75.
[30] 管锡华. 论注释与训诂和古籍整理研究的关系[J]. 合肥师范学院学报, 1994(2):58-62.
[31] 纪健生. 厚积薄发 金针度人——读吴孟复《古籍研究整理通论》[J].古籍研究, 1998(2):104-111.
[32] 姜亮夫. 整理与研究异同辨——有关古籍整理研究若干问题之一[J]. 文史哲, 1984(6):81-85.
[33] 徐国庆. 现代汉语词汇系统论[M]. 北京: 北京大学出版社, 1999年, 第184-187页.
[34] 陆宗达, 王宁. 训诂与训诂学[M]. 山西: 山西教育出版社, 1994.
[35] 崔文印. 关于古籍整理的一些问题[J]. 史学史研究, 1985(1):21-28.
[36] 吕叔湘. 南北朝人名与佛教[J]. 中国语文,1988年第4期.
[37] 辛志贤. 《左传》地名考辨[J]. 北京师范大学学报:社会科学版, 1996(3):20-27.
[38] 徐志明. 程甲本《红楼梦》回目之人物称呼统计研究[J]. 剑南文学(下半月), 2011(11):75-76.
[39] 何凌霞. 《三国志》专名研究[博士论文]. 复旦大学, 2009.
[40] 胡道静. 叶廷珪和《海录碎事》[J]. 辞书研究, 1990, 1990(1):107-115.
[41] 王映予. 宋代类书《海录碎事》研究[博士学位论文]. 兰州: 兰州大学, 2017.
[42] 彭婵娟. 《玉海·艺文》所引宋代文献研究[硕士学位论文]. 广西: 广西师范大学, 2016.
[43] 衡中青. 地方志知识组织及内容挖掘研究——以《方志物产·广东》为例[博士学位论文]. 江苏: 南京农业大学, 2007.
[44] 张明. 刘孝标《世说新语注》引书研究——经、子、集三部[博士学位论文]. 吉林: 东北师范大学, 2009.
[45] 李文娟. 《太平御览》引《论语》考[硕士学位论文]. 山东:曲阜师范大学, 2014.
[46] 刘跃进. 《玉海·艺文》的特色及其价值[J]. 复旦学报(社会科学版), 2009(4):38-42.
[47] 赵逵夫. 校读法的概念、范围与条件[J]. 古籍整理研究学刊, 2007(3):1-4.
[48] 宋子然. 中国古书校读法[M]. 四川: 巴蜀书社, 1995.
[49] 时永乐, 门凤超. 古籍版本学的研究内容[J]. 图书馆理论与实践, 2008(4).
[50] 梁岳标. 郑藏本的渊源与流变[C]. 红楼梦研究//红迷会仪征分会. 2018:19-36.
[51] 陈爱志. 数字化古籍对古籍整理与研究的影响[J]. 中华医学图书情报杂志, 2011, 20(1):18-20.
[52] Lunin L F, Rada R. Perspectives on Hypertext: Introduction and Overview[J]. Journal of the American Society for Information Science, 1989, 40.
[53] Destefano D, Lefevre J A. Cognitive load in hypertext reading: A review[J]. Computers in Human Behavior, 2007, 23(3):1616-1641.
[54] 常娥. 古籍智能处理技术研究——农业古籍自动编纂和自动校勘的研究[博士学位论文]. 江苏:南京农业大学, 2007.
[55] Scheiter K, Gerjets P. Learner Control in Hypermedia Environments[J]. Educational Psychology Review, 2007, 19(3):285-307.
[56] Greif I. Computer-supported cooperative work: a book of readings[J]. Communications of the ACM, 1988:0180.
[57] 史美林. 计算机支持的协同工作理论与应用[M]. 北京: 电子工业出版社, 2000.
[58] 颜运梅. 众包在国内古籍数据库建设中的应用研究[J]. 图书馆研究, 2016(5):30-34.
[59] 辛睿龙, 王雅坤. 古籍数字化中汉字处理的现状、问题及策略[J]. 图书馆理论与实践, 2017(9):103-107.
[60] Kittur A, Nickerson J V, Bernstein M, et al. The future of crowd work[J]. Social Science Electronic Publishing, 2013, 263(1):1301-1318.
[61] 马创新, 陈小荷, 曲维光. 注疏文献中的注释语句自动分析[J]. 计算机科学, 2012, 39(10):220-223.
[62] 马创新, 陈小荷, 曲维光. 经典古籍注疏文献的知识网络研究与设计[J]. 图书情报工作, 2013(9):124-128.
[63] 白振田, 衡中青, 侯汉清. 地方志引书挖掘系统的设计与实现[J]. 图书馆杂志, 2008, 27(8):50-54.
[64] 汤亚芬. 先秦古汉语典籍中的人名自动识别研究[J]. 数据分析与知识发现, 2013, 29(7/8):63-68.
[65] 朱锁玲. 命名实体识别在方志内容挖掘中的应用研究——以广东、福建、台湾三省《方志物产》为例[博士学位论文]. 江苏: 南京农业大学, 2011.
[66] 朱积孝. 试谈古籍的情报价值与开发利用[J]. 图书馆工作与研究, 1995(4):12-15.
[67] 郑永晓. 古籍数字化与古典文学研究的未来[J]. 文学遗产, 2005(5):130-137.
[68] 常娥, 黄建年, 侯汉清. 古籍智能整理与开发系统构建研究[J]. 情报资料工作, 2009(4):43-47.
[69] 赵新. 从《儒藏》精华编看古籍数字化的价值理念与技术前景[J]. 现代出版, 2016(2):31-33.
[70] 朱小健. 古籍整理通用系统及其中字典的编纂[J]. 语言文字应用, 2000(3):99-103.
[71] 陈力. 中文古籍数字化方法之检讨[J]. 国家图书馆学刊, 2005, 14(3):11-16. DOI:10.3969/j.issn.1009-3125.2005.03.003.
[72] 汪毅夫. 《台海击钵吟集》史实丛谈——兼谈台湾文学古籍研究的学术分工[J]. 福建师范大学学报(哲学社会科学版), 2007(1):47-52.
[73] 于亭. 计算机与古籍整理研究手段现代化[J]. 古汉语研究, 2000(3):66-70.
[74] 王兆鹏. 古籍文献的检索工具书概述[J]. 古典文学知识, 2003(2):98-107.
[75] 曹书杰. 古籍中人物字号、别名的查考[J]. 古籍整理研究学刊, 1989(4):46-49.
[76] 史睿. 论中国古籍的数字化与人文学术研究[J]. 国家图书馆学刊, 1999(2):28-35.
[77] 于亭. 略谈计算机古籍资料库建设[J]. 古籍整理研究学刊, 1999(6):11-12.
[78] 童强. 从注疏之学看唐代学术思想的发展[J]. 江海学刊, 2002(4).
[79] 李亦茹. 试论清代注疏体文献的检索功能[J]. 图书馆理论与实践, 2004(6).
[80] 葛志毅. 史学方法论与传统考据学[J]. 学习与探索, 1990(1):123-132.
[81] 杨士首. 古汉语同实异名现象的产生[J]. 辽宁大学学报(哲学社会科学版), 1991(5):72-74.
[82] 尚永亮. 数据库、计量分析与古代文学研究的现代化进程[J]. 文学评论, 2007(6):187-190.
[83] 施吕彦. 《现代计量学概论》, 北京: 中国计量出版社, 2003.
[84] Michel J B, Shen Y K, Aiden A P, et al. Quantitative Analysis of Culture Using Millions of Digitized Books[J]. Science, 2010, 331(6014):176-182.
[85] Torget A J, Mihalcea R, Christensen J, et al. Mapping Texts: Combining Text-Mining and Geo-Visualization To Unlock The Research Potential of Historical Newspapers[J]. UNT Scholarly Works, 2011.
[86] John Sinclair. Corpus, concordance, collocation = 语料库、检索与搭配[M]. 上海: 上海外语教育出版社, 1999.
Publication date:

 2019-06-17

2019-05-27

大学英语写作学习平台游戏化设计研究与实践.戴欣怡

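Link

Title:

 大学英语写作学习平台游戏化设计研究与实践

Name:

 戴欣怡

Student ID:

 1601210495

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Public release:

 Public

Degree level:

 Master's

Degree:

 Master of Engineering (professional degree)

Institution:

 Peking University

School:

 School of Software and Microelectronics

Advisor 1 name:

 俞敬松

Advisor 1 affiliation:

 School of Electronics Engineering and Computer Science (信息科学技术学院)

Defense date:

 2019-05-27

Keywords:

 College English writing; gamified learning; product design

Abstract: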

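    As internationalization advances, people attach growing importance to developing foreign language ability, especially the ability to use a language in real situations, and English writing ability occupies a very important place within it. However, Chinese students have a relatively weak foundation in foreign languages and few opportunities for real practice, and writing proficiency cannot be raised overnight. Writing has therefore become a weak point for Chinese learners of foreign languages and an object of their anxiety and fear.
    With the growing popularity of online learning and the spread of the concept of gamification, gamified learning methods are increasingly applied in online English learning systems. Existing online writing-learning systems, however, stop at writing practice itself and do not consider students' psychological state and how it changes during the learning process; they can neither motivate students nor relieve their anxiety.
    Based on gamified writing-learning theory, motivation theory, and anxiety theory, this thesis analyzes the strengths and weaknesses of existing related products in their gamification design. Addressing the psychological needs of the target student group and the problems in existing gamification designs for online learning, it completes the gamification design of a college English learning platform. Mechanisms such as games, rankings, and rewards raise students' sense of achievement and satisfaction in the learning process; clear stage-by-stage learning goals and a relaxed learning atmosphere lower their anxiety; and a multidimensional mechanism for recording and evaluating learning behavior gives students a more objective evaluation.
    Because the gamification elements and mechanisms in the system are numerous and intricate, this thesis selects three representative gamification modules from the work, namely the course map, the points system, and the leaderboard, for detailed study and discussion, and proposes different design options for each module. It also briefly introduces the design ideas and schemes for the system's badges, task system, and gamification-based student evaluation method.
    Building on this work, the study selected 40 second-year non-English-major university students for a controlled teaching experiment. Through observation during the experiment and post-experiment questionnaires, data analysis, and interviews, it examined the different design options for the course map, points system, and leaderboard modules. It demonstrated the effectiveness of the design in helping students clarify learning goals, giving them positive motivation and a sense of achievement, helping external motivation become internalized to raise enthusiasm for learning, and building students' confidence while reducing anxiety during writing learning. For several difficult questions, it also identified the best design option for each module.
    The gamification design of the college English writing platform in this study fills a gap in gamified learning for online English writing. It effectively raises students' sense of competence, autonomy, and belonging in the process of learning to write and eases their fear and anxiety about it. The study also proposes a gamification-based student evaluation mechanism, which makes up for the lack of formative assessment in existing gamified English learning designs and offers some reference value for mobile teaching of English writing and for gamified learning.

Abstract (foreign language):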

    With continuing internationalization, people pay more and more attention to foreign language competence, especially the ability to use the language in real life, in which writing ability plays a crucial role. However, since writing skills cannot be improved overnight, most Chinese students, who have limited knowledge of English and lack opportunities to put it into practice, are frustrated when they write in English.
    With the growing popularity of online learning and the spread of gamification, gamified learning methods are increasingly used in online English learning systems. Unfortunately, existing online writing-learning systems focus only on the writing practice itself and fail to take into account students' ever-changing psychological state in the process of learning to write. Consequently, these systems can neither motivate learners nor relieve their anxiety.
    Based on gamified writing-learning theory, motivation theory, and anxiety theory, this paper analyzes the advantages and disadvantages of existing comparable products in terms of their gamification design, and aims to design an English learning platform that caters to the psychological needs of university students. Learners' sense of achievement and satisfaction are enhanced by games, rankings, rewards, and other mechanisms in the learning process. Meanwhile, by setting out learning objectives for each stage and creating a relaxed learning atmosphere, the platform helps them experience less anxiety. Moreover, students receive a more objective evaluation based on multidimensional records of learning behavior.
    Because of the complexity of the numerous gamification elements and mechanisms involved in the system, this paper selects three representative components, namely the learning map, the score system, and the ranking list, for detailed research and discussion. Different designs are proposed for each of them. In addition, the design ideas and schemes of the medals, the task system, and the gamification-based evaluation method are briefly introduced.
    Based on the above work, this study selected 40 non-English majors in their second year for comparative experiments. Various designs for the learning map, score system, and ranking list are analyzed with the help of experimental observation, a post-experiment questionnaire survey, data analysis, and interviews. The designs include helping students set clear learning goals, providing motivation and a sense of accomplishment, guiding students to internalize motivation to increase their enthusiasm, and building their confidence while reducing their anxiety in learning to write. The best design for each component is proposed.
    In this study, the gamification design of the college English writing platform fills a gap in online English writing learning and effectively enhances students' confidence, learner autonomy, and sense of belonging in the process of learning how to write. It also alleviates their anxiety and fear during learning. The gamification-based evaluation mechanism compensates for the lack of formative assessment in existing gamified English learning designs. It has some reference value for online English writing learning and gamified learning.
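
The abstract singles out the score system and the ranking list as core gamification components without giving their rules. Purely as an illustration of how the two fit together, the sketch below awards points per learning event and derives a ranking in which tied scores share a rank; the event names and point values are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict

# Hypothetical point values; the thesis does not publish its scoring rules.
POINTS = {"submit_essay": 10, "revise_essay": 5, "peer_review": 3}


class Leaderboard:
    def __init__(self):
        self.scores = defaultdict(int)

    def record(self, student, event):
        """Add the points for one learning event to a student's total."""
        self.scores[student] += POINTS[event]

    def ranking(self):
        """Return (rank, student, score) rows, highest score first.
        Students with equal scores share a rank; the next rank skips ahead."""
        rows = sorted(self.scores.items(), key=lambda kv: (-kv[1], kv[0]))
        result, prev_score, rank = [], None, 0
        for position, (student, score) in enumerate(rows, start=1):
            if score != prev_score:
                rank, prev_score = position, score
            result.append((rank, student, score))
        return result


board = Leaderboard()
board.record("A", "submit_essay")
board.record("B", "revise_essay")
board.record("B", "revise_essay")
board.record("C", "peer_review")
table = board.ranking()
```

Sharing ranks on ties is one of several defensible choices; a design aimed at reducing anxiety might instead show only nearby ranks or rank within small groups.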

Classification number:

 TP3

Total pages:

 60

Number of references:

 44

References:
艾瑞咨询(iResearch). 中国移动游戏行业研究报告[EB/OL]. [2018.07]. http://report.iresearch.cn/wx/report.aspx?id=3266.
鲍雪莹, 赵宇翔. 游戏化学习的研究进展及展望[J]. 电化教育研究, 2015, 36(8): 45-52.
贝晓越. 写作任务的练习效应和教师反馈对不同外语水平学生写作质量和流利度的影响[J]. 现代外语(季刊), 2009, 32:4:389-398.
蔡慧萍,方琰. 英语写作教学现状调查与分析[J]. 外语与外语教学, 2006, 9:21-24.
蔡兰珍. “任务教学法”在大学英语写作中的应用[J]. 外语界, 2001, 4:6:41-46.
曹荣平, 张文霞, 周燕. 形成性评估在中国大学非英语专业英语写作教学中的运用[J]. 外语教学, 2004, 5:82-87.
池丽萍, 辛自强. 大学生学习动机的测量及其与自我效能感的关系[J]. 心理发展与教育, 2006, 22(2):64-70.
郭燕, 秦晓晴. 中国非英语专业大学生的外语写作焦虑测试报告及其对写作教学的启示[J]. 外语界, 2010, (2):54-62.
简·麦戈尼格尔. 游戏改变世界:游戏化如何让现实变得更美好[M]. 北京:北京联合出版公司, 2016.
凯文·韦巴赫, 丹·亨特. 游戏化思维:改变未来商业的新力量[M]. 浙江:浙江人民出版社, 2014.
兰良平, 韩刚. 英语写作教学——课堂互动性交流视角[M]. 北京:外语教学与研究出版社, 2014.
李航. 大学生英语写作焦虑和写作成绩的准因果关系:来自追踪研究的证据[J]. 外语界, 2015, (3):68-75.
李炯英, 李青. 我国外语焦虑研究:回顾与反思——基于外语类期刊近十年 (2006—2015) 论文的统计分析[J]. 外语界, 2016, 4:58-65.
刘梅华. 论低自信和课堂表现焦虑对大学生英语学习的影响:交叉滞后研究[J]. 外语教学, 2011, 32(5): 43-47.
唐丽洁. 国内十年游戏化学习研究现状与分析[J]. 中国教育信息化, 2015, (10):23-25.
王庆, 钮沐联, 陈洪, 等. 国内教育游戏研究发展综述[J]. 电化教育研究, 2012, (1): 82-84, 89.
文秋芳. 英语学习策略论[M]. 上海:上海外语教育出版社, 1996.
吴庆麟, 胡谊, 朱晓红. 教育心理学[M]. 上海:华东师范大学出版社, 2018.
徐杰, 杨文正, 李美林, 等. 国际游戏化学习研究热点透视及对我国的启示与借鉴——基于 Computers & Education (2013-2017) 载文分析[J]. 远程教育杂志, 2018, 36(6), 73-83.
中国音数协游戏工委(GPC), CNG 中新游戏研究(伽马数据), 国际数据公司(IDC). 2018 中国游戏产业报告:摘要版[M]. 北京:中国书籍出版社, 2018.
Black P., & William D. Inside the Black Box: Raising Standards Through Classroom Assessment[M]. Granada Learning, 2005.
Black P, & William D. Developing the Theory of Formative Assessment [J]. Educational Assessment, Evaluation and Accountability, 2009, 21(1): 5-31.
Bloom B S, Hastings J T, Madaus G F. Handbook on Formative and Summative Evaluation of Student Learning [M]. New York: MacGraw-Hill, 1971.
Boud D. Assessment and learning: contradictory or complementary[J]. Assessment for learning in higher education. 1995:35-48.
Broadfoot P, Daugherty R, Gardner J, et al. Assessment for Learning: Beyond the Black Box [M]. Cambridge, England: University of Cambridge School of Education, 1999.
Buck G A, Trauth-Nare A E. Preparing teachers to make the formative assessment process integral to science teaching and learning[M]. Journal of Science Teacher Education, 2009, 20(5):475-494.
Carless D. Learning-oriented Assessment: Conceptual Bases and Practical Implications [J]. Innovations in Education & Teaching International, 2007, 44(1): 57-66.
Clarke-Midura J & Groff J. Formal Game-Based Assessments: The challenge and opportunity of building next generation assessments[C]. Madison: Games+Learning+Society 8.0, 2012.
Cowie B, Bell B. A Model of Formative Assessment in Science Education [J]. Assessment in Education: Principles, Policy & Practice, 1999, 6(1): 101-116.
Davison C, & Leung C. Current Issues in English Language Teacher-based Assessment [J]. TESOL Quarterly, 2009, (43): 393-415.
Hattie J, Timperley H. The Power of Feedback [J]. Review of Educational Research, 2007, 77(1): 81-112.
Horowitz D. Process not product: less than meets eyes [J]. TESOL Quarterly. 1986, 20(1):141-4.
KAPP Karl M. Games, gamification, and the quest for learner engagement[J]. T+ D, 2012, 66(6):64-68.
Kapp Karl M. The gamification of learning and instruction fieldbook: Ideas into practice [M]. John Wiley & Sons, 2013.
Kim Yoon Jeon, Valerie J Shute. The interplay of game elements with psychometric qualities, learning, and enjoyment in game-based assessment[J]. Computers & Education, 2015, 87: 340-356.
Malone T W. What makes things fun to learn? Heuristics for designing instructional computer games[C]. Proceedings of the 3rd ACM SIGSMALL symposium and the first SIGPC symposium on Small systems, ACM, 1980: 162-169.
Nicholson S. A User-Centered Theoretical Framework for Meaningful Gamification [C]. Madison: Games+Learning+Society 8.0, 2012.
Rea-Dickins P. Mirror, Mirror on the Wall: Identifying Processes of Classroom Assessment [J]. Language Testing, 2001, 18(4):429-462.
Tsai Fu-Hsing, Chin-Chung Tsai, Kuen-Yi Lin. The evaluation of different gaming modes and feedback types on game-based formative assessment in an online learning environment[J]. Computers & Education, 2015, 81: 259-269.
Ventura M, Shute V. The validity of a game-based assessment of persistence[J]. Computers in Human Behavior, 2013, 29(6): 2568-2572.
Woodrow L. College English writing affect: Self-efficacy and anxiety [J]. System, 2011, (39):510-522.
Wynne Harlen, Mary James. Assessment and Learning: differences and relationships between formative and summative assessment [J]. Assessment in Education: Principles, Policy & Practice, 1997, 4:3, 365-379
Young D J. Creating a low-anxiety classroom environment: What does language anxiety research suggest? [J]. The modern language journal. 1991:75(4):426-437.
Y-S Cheng. A measure of second language writing anxiety: Scale development and preliminary validation [J]. Journal of Second Language Writing, 2004,13:313-335.
Zichermann G, Cunningham C. Gamification by Design: Implementing Game Mechanics in Web and Mobile Apps [M]. New York: O’Reilly Media Inc., 2011.
Publication date:

 2019-06-14

中文文本分析量化指标体系的研究与应用.杨雨萌

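Link

Title:

 中文文本分析量化指标体系的研究与应用

Author:

 杨雨萌

Student ID:

 1601210811

Language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Public release:

 After 3 years

Degree level:

 Master's

Degree:

 Master of Engineering (professional degree)

Institution:

 Peking University

School:

 School of Software and Microelectronics

Advisor name:

 俞敬松

Advisor affiliation:

 School of Software and Microelectronics

Defense date:

 2019-05-27

Title (foreign language):

 Research and Application of Quantitative Indices System for Chinese Textual Analysis

Keywords (Chinese):

 中文文本分析 语言特征 量化指标 自然语言处理

Keywords (foreign language):

 Chinese textual analysis Linguistic feature Quantitative indices Natural language processing

Abstract: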

文本特征的自动量化分析是通过计算机程序实现文本特征的定量评估。文本量化的一大核心是建立一组反映文本特征的指标体系。相较于人工分析,定量分析文本特征更加客观和高效。因此,在西方它已被应用在字母类语言的话语分析、语料库研究等领域。目前,文本特征量化指标的研究多以英文文本为对象,中文文本分析量化指标研究较少。从现有研究来看,中文文本分析主要存在三个问题。第一,现有的中文文本量化指标体系化不足,研究较为单一且不够全面;第二,缺少一个中文文本自动量化分析系统;第三,中文文本量化分析指标体系和量化分析系统的应用价值亟待研究和证明。

基于上述中文文本分析量化指标研究中存在的问题和不足,本研究围绕现代汉语,确立了文本分析量化指标研究、自动量化分析工具的实现、量化指标的应用三大研究主线。第一,在英文文本分析量化指标的研究基础上,以汉语特点和汉语语法为纲,建立了适用于中文文本分析的量化指标体系;第二,以中文文本量化指标体系为基础,依托于自然语言处理技术与中文语言资源,设计并实现了一款中文文本量化分析工具;第三,将中文文本量化分析工具应用于现代汉语文本特征研究和文本分级模型。

本研究建立的中文文本分析量化指标体系聚焦于文本的语言特征,总共包括五个层面:描述性特征层面、汉字层面、词汇层面、句子层面和语篇层面。描述性特征层面包括12个指标;汉字层面包括32个指标;词汇层面包括67个指标;句子层面包括60个指标;语篇层面包括1个指标。整个文本量化指标体系包含的指标共计170余项。本研究以两个应用为例,阐述了中文文本分析量化指标体系和量化分析系统的实用价值。第一,本研究以人教版小学语文教科书课文为例,从汉字、词汇和句子等五个层面,对语料进行了较为全面地统计分析。第二,本研究基于机器学习算法和量化指标体系,构建了文本分级模型,模型的预测准确度高达0.90左右。
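
The five-level structure described above can be illustrated with a small sketch. The per-level counts come from the abstract; the individual indices computed here (character tokens, character types, type-token ratio, mean sentence length) and the function name `basic_indices` are assumptions chosen for illustration, not the thesis's actual indicator definitions.

```python
import re
from collections import Counter

# Indicator counts per level, as stated in the abstract (they sum to 172,
# which matches "more than 170 indicators").
LEVEL_INDICATOR_COUNTS = {
    "descriptive": 12,
    "character": 32,
    "word": 67,
    "sentence": 60,
    "discourse": 1,
}

SENTENCE_END = re.compile(r"[。！？!?]+")   # sentence-final punctuation
CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")   # basic CJK Unified Ideographs block


def basic_indices(text: str) -> dict:
    """Compute a few illustrative surface indices for a Chinese text.

    These are hypothetical stand-ins for the thesis's indicators.
    """
    sentences = [s for s in SENTENCE_END.split(text) if s.strip()]
    chars = CJK_CHAR.findall(text)
    n_chars = len(chars)
    n_types = len(Counter(chars))
    return {
        "sentence_count": len(sentences),
        "char_tokens": n_chars,
        "char_types": n_types,
        "char_type_token_ratio": n_types / n_chars if n_chars else 0.0,
        "mean_sentence_length": n_chars / len(sentences) if sentences else 0.0,
    }
```

Word-level and sentence-level indicators would additionally need word segmentation and parsing, which this sketch deliberately leaves out.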

研究数据结果显示,随着年级的上升,小学教科书课文的字词量、汉字复杂度、词汇难度和句法复杂度等特征值均呈现上升态势,基本遵循了从简到难的编排特点。然而,课文的用字用词仍存在改进之处。例如低年级课文中出现了较多的非常用字词;部首表收录内容与课文用字的关联度较弱等。这些研究结果已被应用于北京大学俞敬松老师研究小组的相关教学研究中。此外,本研究构建的文本分级模型能参照标准教科书,预测文本的阅读级别,从而被应用于不同阅读级别文本的自动分类和筛选。

文摘(外文):

Automated textual analysis evaluates text features quantitatively with computer programs. A core issue in textual analysis is how to build a set of indicators that reflect text characteristics. Compared with manual analysis, quantitative analysis of text is more objective and efficient, so it has been applied to discourse analysis and corpus research on alphabetic languages in the Western world. Currently, most studies in automated textual analysis focus on English texts, and studies of Chinese textual analysis are rare. Three major shortcomings can be found in the existing studies of Chinese textual analysis. First, the existing quantitative indices for Chinese texts are neither systematic nor comprehensive. Second, there is no tool available for the automated quantitative analysis of Chinese texts. Third, the practical value of a quantitative indices system and an automated analysis tool has yet to be researched and validated.

To address these problems and fill the research gap, this research focuses on three issues related to the analysis of modern Chinese texts. First, building on research into quantitative indices for English texts, a quantitative indices system for Chinese textual analysis is established based on the characteristics and grammar of Chinese. Second, a tool for the automated analysis of Chinese texts is designed and built on the indices system, with the support of natural language processing technology and Chinese language resources. Third, the tool is applied to the study of textual features of modern Chinese and to the construction of text leveling models.

The quantitative indices system for Chinese textual analysis focuses on linguistic features. It consists of five levels: descriptive indices, Chinese characters, words, sentences, and discourse, with 12, 32, 67, 60, and 1 indicators, respectively. In total, the indices system has more than 170 indicators. Two applications are used as examples to demonstrate the practical value of the indices system and the automated tool. In the first application, the linguistic features of primary school Chinese textbooks published by the People's Education Press are extracted, and the texts are analyzed on the five levels, including Chinese characters, words, and sentences. In the second application, the indices system is combined with machine learning algorithms to build text leveling models for Chinese texts, whose prediction accuracy is about 0.90.

According to the results, many linguistic feature values of the textbook texts, such as character and word counts, Chinese character complexity, vocabulary difficulty, and syntactic complexity, show an upward trend as the grade increases, indicating that in general the texts progress from simple to difficult. However, there is still room for improvement in the use of characters and words. For example, the textbooks for lower grades contain many uncommon characters and words, and the content of the radical table has little connection with the Chinese characters used in the texts. These findings have been used in related teaching research by Jingsong Yu's research group at Peking University. Furthermore, the text leveling models can predict the reading level of Chinese texts with reference to standard textbooks, and can therefore be used for the automated classification and selection of Chinese reading texts.
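
The abstract does not name the machine learning algorithm behind the text leveling models. As a rough illustration of the idea only, the sketch below swaps in a nearest-centroid classifier: each grade is represented by the mean feature vector of its texts, and a new text is assigned the grade with the closest centroid. The feature choice (character count, mean sentence length) and all function names are assumptions.

```python
from math import dist
from statistics import fmean


def fit_centroids(samples):
    """Average the feature vectors of each grade.

    samples: list of (feature_vector, grade_label) pairs.
    Returns {grade_label: centroid_vector}.
    """
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [fmean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }


def predict_level(centroids, features):
    """Return the grade whose centroid is nearest in Euclidean distance.

    Features are used unscaled here; a real model would standardize them first.
    """
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical training data: (character count, mean sentence length) per text.
training = [
    ([100, 8.0], 1), ([120, 9.0], 1),
    ([600, 20.0], 6), ([640, 22.0], 6),
]
centroids = fit_centroids(training)
```

With these toy centroids, `predict_level(centroids, [130, 10.0])` selects grade 1 and `predict_level(centroids, [580, 19.0])` selects grade 6.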

分类号:

 TP3    

论文总页数:

 147    

参考文献数:

 86    

参考文献:
[1] Freud S. The Psychopathology of Everyday Life [M]. New York: W.W. Norton & Company, 1989: 50.
[2] 王力. 汉语语法纲要[M]. 上海: 上海教育出版社, 1982.
[3] 吴思远,蔡建永,于东, 等. 文本可读性的自动分析研究综述[J]. 中文信息学报, 2018, 32(12):1-10.
[4] Kincaid J P, Fishburne Jr R P, Rogers R L, et al. Derivation of New Readability Formulas (Automated Readability Index, Fog Count and Flesch Reading Ease Formula) For Navy Enlisted Personnel[R]. Institute for Simulation and Training, 1975: 56.
[5] Gunning R. The Technique of Clear Writing [M]. New York: McGraw-Hill, 1952.
[6] Caylor J S, Sticht T G, Fox L C, et al. Methodologies for Determining Reading Requirements of Military Occupational Specialties [J]. Human Resources Research Organization, 1973: 81.
[7] 张宁志. 汉语教材语料难度的定量分析[J]. 世界汉语教学, 2000, 3: 83-88.
[8] 王蕾. 初中级日韩留学生文本可读性公式初探[硕士学位论文]. 北京: 北京语言大学, 2005.
[9] 郭望皓. 对外汉语文本易读性公式研究[硕士学位论文]. 上海: 上海交通大学, 2010.
[10] 孙刚. 基于线性回归的中文文本可读性预测方法研究[硕士学位论文]. 南京: 南京大学, 2015.
[11] 荆溪昱. 中文国文教材的适读性研究: 适读年级值的推估[J]. 教育研究资讯, 1995, 3(3): 113-127.
[12] 杨金余. 高级汉语精读教材语言难度测定研究[硕士学位论文]. 北京: 北京大学, 2008.
[13] 左虹,朱勇. 中级欧美留学生汉语文本可读性公式研究[J]. 世界汉语教学, 2014, 28(2): 263-276.
[14] 杨孝溁. 实用中文报纸可读性公式[J]. 新闻学研究, 1974, 13:37-62.
[15] 张必隐, 孙汉银.中文易懂性公式[A]. //北京师范大学. 中美教育问题研讨会论文集[C]. 1992: 246-249.
[16] 罗素华. 汉语中级泛读教材难度定量分析[硕士学位论文]. 长沙: 湖南师范大学, 2015.
[17] Bååth R. ChildFreq: An Online Tool to Explore Word Frequencies in Child Language [J]. Lucs Minor, 2010, 16: 1-6.
[18] Marsden E, Myles F, Rule S, et al. Using Childes Tools for Researching Second Language Acquisition [J]. British Studies in Applied Linguistics, 2003, 18: 98-113.
[19] Tausczik Y R, Pennebaker J W. The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods [J]. Journal of Language and Social Psychology, 2010, 29(1): 24-54.
[20] Rude S, Gortner E M, Pennebaker J. Language Use of Depressed and Depression-Vulnerable College Students [J]. Cognition & Emotion, 2004, 18(8): 1121-1133.
[21] Newman M L, Pennebaker J W, Berry D S, et al. Lying Words: Predicting Deception from Linguistic Styles [J]. Personality and Social Psychology Bulletin, 2003, 29(5): 665-675.
[22] 黄金兰, 林以正, 谢亦泰, 等. 中文版语文探索与字词计算词典之建立[J]. 中华心理学刊, 2012, 54(02):185-201.
[23] 张信勇. LIWC: 一种基于语词计量的文本分析工具[J]. 西南民族大学学报: 人文社会科学版, 2015, 36(4): 101-104.
[24] Sung Y T, Chang T H, Lin W C, et al. CRIE: An Automated Analyzer for Chinese Texts [J]. Behavior Research Methods, 2016, 48(4): 1238-1251.
[25] Graesser A C, McNamara D S, Kulikowich J M. Coh-Metrix: Providing Multilevel Analyses of Text Characteristics [J]. Educational Researcher, 2011, 40(5): 223-234.
[26] McNamara D S, Graesser A C, McCarthy P M, et al. Automated Evaluation of Text and Discourse with Coh-Metrix [M]. Cambridge: Cambridge University Press, 2014.
[27] 江进林. Coh-Metrix工具在外语教学与研究中的应用[J]. 中国外语, 2016, 13(5): 58-65.
[28] 张琇涵, 倪雅真, 廖晨惠, 等. 实词笔、一词多义、笔画数指标建置中文文本自动化分析系统[C]. 第八届信息科技国际研讨会. 2014.
[29] 倪雅真. 儿童文本句子相似度指标及可读性公式建置与应用[硕士学位论文]. 台中: 国立台中教育大学, 2014.
[30] 蔡筱倩. 儿童文本词频词汇指标分析系统建置与应用[硕士学位论文]. 台中: 国立台中教育大学, 2013.
[31] 黄勇媜. 儿童文本语词重复指标分析系统建置与应用[硕士学位论文]. 台中: 国立台中教育大学, 2013.
[32] 陈文兰. 儿童文本关联词指标分析系统建置与应用[硕士学位论文]. 台中: 国立台中教育大学, 2013.
[33] 陈建宏. 儿童文本词类指标分析系统建置与应用[硕士学位论文]. 台中: 国立台中教育大学, 2013.
[34] 蔡亚韦. 儿童文本潜在语义指标分析系统建置与应用[硕士学位论文]. 台中: 国立台中教育大学, 2013.
[35] 叶静如. 中文文本词汇多样化自动化分析系统建置与探讨[硕士学位论文]. 台中: 国立台中教育大学, 2014.
[36] 宋曜廷, 陈茹玲, 李宜宪, 等. 中文文本可读性探讨: 指针选取, 模型建立与效度验证[J]. 中华心理学刊, 2013, 55(1): 75-106.
[37] 别小雷. 基于“新大纲”的《新实用汉语课本》语料难度定量分析[硕士学位论文]. 成都: 西南交通大学, 2017.
[38] 吕禾. 新旧HSK词汇大纲比较研究[J]. 黑龙江社会科学, 2012(4): 134-136.
[39] 刘又辛. 汉语汉字答问[M]. 北京: 商务印书馆, 1997: 65.
[40] 索绪尔. 普通语言学教程[M]. 北京: 商务印书馆, 1980: 50-51.
[41] 张旺熹. 从汉字部件到汉字结构——谈对外汉字教学[J]. 世界汉语教学, 1990(2): 112-120.
[42] 邢红兵. 《(汉语水平)汉字等级大纲》 汉字部件统计分析[J]. 世界汉语教学, 2005(2): 49-55.
[43] 陈小雨. 泰国学生HSK4级中多音字教学研究[硕士学位论文]. 天津: 天津师范大学, 2018.
[44] 胡裕树, 许宝华. 现代汉语[M]. 上海: 上海教育出版社, 1981.
[45] 葛本仪. 现代汉语词汇学[M]. 第三版. 北京: 商务印书馆, 2014: 28.
[46] 吕叔湘. 说 “自由” 和 “粘着”[J]. 中国语文, 1962(1): 1-6.
[47] 刘中富. 现代汉语词汇特点初探[J]. 东岳论丛, 2002(6):138-142.
[48] 万宇,朱颖华.汉语儿童阅读分级与西方阅读分级的差异研究[J]. 图书馆杂志, 2016, 35(5):106-111.
[49] 黄伯荣, 廖序东. 现代汉语[M]. 修订版. 兰州: 甘肃人民出版社, 1983.
[50] 刘月华. 实用现代汉语语法[M]. 增订本. 北京: 商务印书馆, 2001: 3.
[51] 朱德熙. 语法讲义[M]. 北京: 商务印书馆. 1982: 22-23.
[52] 邵敬敏. 现代汉语通论[M]. 上海: 上海教育出版社, 2001.
[53] 池昌海. 现代汉语语法修辞教程[M]. 杭州: 浙江大学出版社, 2014:120.
[54] 李庆荣. 现代实用汉语修辞[M]. 修订版. 北京: 北京大学出版社, 2010: 3.
[55] 聂仁发. 现代汉语语篇研究[M]. 杭州: 浙江大学出版社, 2009: 124-129.
[56] 艾伟. 汉字问题[M]. 北京: 商务印书馆. 2017: 12-15.
[57] 喻柏林,冯玲,曹河圻, 等. 汉字和人工“字”部件识别的比较研究[J]. 心理科学,1991(5): 1-5.
[58] 许慎. 说文解字[M]. 北京: 线装书局, 2016: 1670-1675.
[59] 潘文. 现代汉字的定义及其结构方式[J]. 南京师范大学文学院学报, 2001(4): 82-87.
[60] Sun C C, Hendrix P, Ma J, et al. Chinese Lexical Database (CLD): A Large-scale Lexical Database for Simplified Mandarin Chinese[J]. Behavior Research Methods. 2018.
[61] 国家汉语水平考试委员会办公室考试中心. 汉语水平词汇与汉字等级大纲[M]. 北京: 经济科学出版社. 2001.
[62] 国家汉语水平考试委员会办公室考试中心, 孔子学院总部. 新汉语水平考试大纲[M]. 北京: 商务印书馆, 2009.
[63] 国家语言文字工作委员会. 现代汉语常用字表[M]. 北京: 语文出版社, 1988.
[64] 董振东, 董强, 郝长伶. 知网的理论发现[J]. 中文信息学报, 2007, 21(4): 3-9.
[65] 《现代汉语常用词表》课题组. 现代汉语常用词表[M]. 北京: 商务印书馆出版社, 2008.
[66] 马芝兰. 现代汉语语法的综合研究[M]. 北京: 中国书籍出版社, 2016:43.
[67] McCarthy P M, Jarvis S. MTLD, Vocd-D, and HD-D: A Validation Study of Sophisticated Approaches to Lexical Diversity Assessment[J]. Behavior Research Methods, 2010, 42(2): 381-392.
[68] 吴勇毅, 吴中伟, 李劲荣. 实用汉语教学语法[M]. 北京: 北京大学出版社, 2016.
[69] 国家对外汉语教学领导小组办公室汉语水平考试部. 汉语水平等级标准与语法等级大纲[M]. 北京: 高等教育出版社, 1996.
[70] 苏新春. 现代汉语分类词典[M]. 北京: 商务印书馆, 2013: 343-345.
[71] 俞士汶, 朱学峰. 现代汉语语法信息词典[DB/OL]. 北京大学开放研究数据平台, V3. 2017, http://dx.doi.org/10.18170/DVN/EDQWIL.
[72] 苗夺谦. 中文信息处理原理及应用[M]. 第二版. 北京: 清华大学出版社, 2015: 122.
[73] Crossley S A, Greenfield J, McNamara D S. Assessing Text Readability Using Cognitively Based Indices[J]. Tesol Quarterly, 2008, 42(3): 475-493.
[74] 张振亚, 王进, 程红梅, 等. 基于余弦相似度的文本空间索引方法研究[J]. 计算机科学, 2005, 32(9): 160-163.
[75] Liu H, Xu C, Liang J. Dependency Distance: A New Perspective on Syntactic Patterns in Natural Languages[J]. Physics of Life Reviews, 2017, 21: 171-193.
[76] 陆前, 刘海涛. 依存距离分布有规律吗?[J]. 浙江大学学报 (人文社会科学版), 2016: 1.
[77] Manning C, Surdeanu M, Bauer J, et al. The Stanford CoreNLP Natural Language Processing Toolkit[C]. Proceedings of 52nd Annual Meeting of The Association for Computational Linguistics: System Demonstrations. 2014: 55-60.
[78] Jieba中文分词组件[CP/OL]. (2018-12-3) [2019-4-8]. https://github.com/fxsjy/jieba.
[79] Huang F. 哈工大PyLTP工具包[CP/OL]. [2019-4-8]. https://github.com/HIT-SCIR/pyltp.
[80] Han H. HanLP开源汉语言处理包[CP/OL]. [2019-4-1]. https://github.com/hankcs/HanLP.
[81] Mikolov T, Sutskever I, Chen K, et al. Distributed Representations of Words and Phrases and Their Compositionality[C]. Advances in Neural Information Processing Systems. 2013: 3111-3119.
[82] 腾讯AI开放平台[CP/OL]. [2019-2-24]. https://ai.qq.com/.
[83] Jones K S. A Statistical Interpretation of Term Specificity and Its Application in Retrieval[J]. Journal of Documentation, 2004, 28(1):493-502.
[84] 周有光. 现代汉字学发凡[J]. 语文现代化丛刊, 1980 (2).
[85] 中华人民共和国教育部. 义务教育语文课程标准[M]. 北京: 北京师范大学出版社, 2011: 5.
[86] 魏贞原. 机器学习Python实践[M]. 北京: 电子工业出版社, 2018: 3.
公开日期:

 2022-06-10    

医学英语词典的研究与设计.尹梦佳

链接

题名:

 医学英语词典的研究与设计    

姓名:

 尹梦佳    

学号:

 1601210822    

论文语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 俞敬松    

导师1单位:

  软件与微电子学院    

论文答辩日期:

 2019-05-27    

关键词:

 医学英语 词汇 词典 个性化设计    

论文摘要:

       人类社会进入21世纪以来,科技的飞速发展带动了各个领域的不断进步,与此同时,人们在各个专业化领域的需求也在不断探索和前进。因此,传统的专业领域工具已经无法满足人们的需要。对于医学领域人士(医生、医学生、医学爱好者等)来说,一款高效的医学英语电子词典是他们工作、学习必不可少的好帮手。

       然而,纵观过往的医学英语电子词典产品,大多存在以下三个问题:第一,依附于通用型词典之上,对于医学领域的专业广度和深度拓展不够。词汇和相关内容的搜索展示仍停留在通用词水平,无法为专业人士提供符合他们专业程度的词汇查阅需求。第二,词典功能单一。当前市面上流行的许多医学电子词典都将功能局限在“查词”上,无法为专业医学人士日常所需的文献阅读和论文撰写提供较为便利的解决方案。第三,无法辨别医学用户的专业偏好。目前,多数常用电子词典对于医学用户的专业偏好并没有区分,从而无法从专业的科室角度为用户提供高效的查词体验,也无法为用户提供个性化的搜索和展示。

      为了解决上面三个问题,本研究首先进行了医学英语词汇的构成分析,研究英语词汇的特点,提升词典设计的合理性。然后从医学英语词汇联结的角度入手,研究了UMLS和SNOMED CT两大医学界较为权威的英语词汇系统,从中提取医学英语词汇之间的相互联系。并且,为了解决现存医学数据库中词汇科室不明的问题,本研究采取机器算法与人工校对相结合的方式,利用不同科室的核心词汇表,加上提取的词汇联系网络,将获得的医学词汇数据进行科室分类。最后,本研究结合个性化产品设计的思路,辅之以针对医学用户日常所需的便利功能,如写作助手和阅读助手,为医学领域的专业用户设计了一款个性化的医学英语电子词典。

      为了验证本研究设计的词典的实用性,本研究邀请了来自浙江大学医学院的专家与同学参与了有效性验证。实验表明,本研究设计的电子词典可以有效提高用户查词的效率。此外,词典系统中的辅助功能,如写作、阅读助手等,也帮助用户提高了学习与工作的效率。本研究设计的医学英语电子词典有效地解决了当前医学电子词典中存在的专业化和个性化问题,帮助用户提高了学习和工作效率,对医学英语电子词典的研究与发展有一定参考价值。

外文摘要:

  Since the beginning of the 21st century, the rapid development of science and technology has driven progress in every field. At the same time, people's needs in specialized fields keep evolving, so traditional professional tools can no longer meet them. For people in the medical field (doctors, medical students, medical enthusiasts, etc.), an efficient medical English electronic dictionary is an essential aid for work and study.

  However, most existing medical English electronic dictionaries share three problems. First, they are built on top of general-purpose dictionaries, so their coverage of the medical field lacks breadth and depth, and professional users struggle to find content that matches their level of expertise. Medical users are also interested in specialized knowledge such as diseases and treatments. Second, most dictionaries today offer only word lookup, which does little to support the literature reading and paper writing that medical professionals do every day. Third, all users are treated alike: the dictionaries cannot distinguish the specialty preferences of medical users, so they cannot provide efficient lookup from a department perspective or personalized search and display.

  To solve these three problems, this study first analyzes the composition and characteristics of medical English vocabulary to make the dictionary design more sound. Then, from the perspective of vocabulary association, it studies two authoritative medical English vocabulary systems, UMLS and SNOMED CT, and extracts the relationships between medical English terms from them. Moreover, to solve the problem that existing medical databases do not state which department a term belongs to, this study combines a machine algorithm with manual proofreading: it uses the core vocabulary lists of 18 departments together with the extracted vocabulary relationship network to classify the collected medical vocabulary by department. Finally, following the idea of personalized product design, this study designs a personalized medical English electronic dictionary for professional users in the medical field, supplemented by convenient functions for their daily needs, such as writing and reading assistants.
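
  The department classification step can be pictured with the following minimal sketch. The abstract only says that core vocabulary lists and a relationship network are combined with manual proofreading; the voting rule used here (assign a term to the department that most of its directly related core terms belong to, and send ties or unrelated terms to manual review) is an assumption, as are the function name and the toy data, which are not taken from UMLS or SNOMED CT.

```python
from collections import Counter


def classify_terms(core_vocab, relations):
    """Assign departments to terms by voting over related core terms.

    core_vocab: {department: set of core terms} (seed labels).
    relations:  iterable of (term_a, term_b) pairs from a vocabulary network.
    Returns (assigned, needs_review): a {term: department} dict that includes
    the seeds, and a sorted list of terms left for manual proofreading.
    """
    seed = {term: dept for dept, terms in core_vocab.items() for term in terms}
    neighbors = {}
    for a, b in relations:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    assigned = dict(seed)
    needs_review = []
    for term in sorted(neighbors):
        if term in seed:
            continue
        votes = Counter(seed[n] for n in neighbors[term] if n in seed)
        ranked = votes.most_common(2)
        tie = len(ranked) == 2 and ranked[0][1] == ranked[1][1]
        if not ranked or tie:
            needs_review.append(term)   # no evidence or ambiguous: ask a human
        else:
            assigned[term] = ranked[0][0]
    return assigned, needs_review


# Hypothetical toy data for illustration.
core = {
    "cardiology": {"myocardium", "arrhythmia"},
    "neurology": {"neuron", "synapse"},
}
links = [
    ("myocarditis", "myocardium"), ("myocarditis", "arrhythmia"),
    ("axon", "neuron"),
    ("syncope", "arrhythmia"), ("syncope", "neuron"),
    ("edema", "swelling"),
]
assigned, needs_review = classify_terms(core, links)
```

  Routing ambiguous terms to a review list mirrors the "machine algorithm plus manual proofreading" split that the abstract describes.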

  To verify the practicability of the dictionary designed in this study, experts and students from the School of Medicine at Zhejiang University were invited to take part in a validity test. The experiments show that the dictionary can effectively improve the efficiency of word lookup. In addition, auxiliary functions in the dictionary system, such as the writing and reading assistants, help users study and work more efficiently. The medical English electronic dictionary designed in this study addresses the specialization and personalization problems of current medical dictionaries, and it offers a useful reference for the further research and development of medical English electronic dictionaries.

分类号:

 TP3    

论文总页数:

 79    

参考文献总数:

 36    

参考文献列表:
戴远君, 徐海. 电子词典研究现状与展望[J]. 辞书研究. 2014.
郭玉峰, 刘保延, 崔蒙, 李平, 杨阳. SNOMED CT内容简介[J]. 中国中医药信息杂志. 2006.
何志兰, 崔杜武. 一种新的用于电子词典的数据压缩算法[J]. 计算机工程, 2005(21): 186-188.
黄艺锋, 闫巧. 基于Android平台电子词典的设计与实现[J]. 计算机应用, 2011(31): 228-232.
黄永, 陆伟, 程齐凯, 等. 学术文本的结构功能识别——在学术搜索中的应用[J]. 情报学报, 2016, 35(4): 425-431.
黄永, 陆伟, 程齐凯, 等. 学术文本的结构功能识别——基于段落的识别[J]. 情报学报. 2016, 35(5): 530-538.
孔行. 基于主题推荐的辅助写作系统[D]. 哈尔滨: 哈尔滨工业大学, 2015.
雷声伟, 陈海华, 黄永, 等. 学术文献引文上下文自动识别研究[J]. 图书情报工作, 2016(17): 78-87.
梁春成, 邢洪波. 电子词典在单片机系统中的应用方法[J]. 微计算机应用, 2001(5): 318-321.
陆伟, 黄永, 程齐凯, 等. 学术文本的结构功能识别——功能框架及基于章节标题的识别[J]. 情报学报, 2014(9): 979-985.
伊马木·达吾提. 电子词典数据压缩算法的设计与实现[J]. 信息与电脑(理论版), 2010(8):110-111.
买日旦·吾守尔, 维尼拉·木沙江. 电子词典软件系统中对维、哈、柯文进行自动判别技术的研究[J]. 新疆大学学报(自然科学版), 2011(1): 88-92.
穆念伟. 医学英语词汇特点[J]. 中华医学写作杂志. 2004, 11(23): 2035-2040.
荣岩. 医学英语词汇学习系统研究与设计[D]. 北京: 北京大学, 2018.
孙枫军. 引文上下文中的概念抽取[D]. 北京: 中国科学技术信息研究所, 2012.
田莺, 杨中华. 基于Qt/Embedded的电子词典的设计与实现[J]. 信息化纵横, 2009(14): 21-23.
王世杰, 赵玉华, 吴永胜. 基于语料库的医学英语词汇[M]. 兰州: 兰州大学出版社, 2013: 1.
吴鹏, 李灵华. 实现基于Google Android平台的电子词典相关技术探讨[J]. 电脑知识与技术, 2011(34): 8876-8878.
Yang. P C. WriteAhead: 以学术论文写作为目的之摘要写作辅助系统[D]. 清华大学资讯系统与应用研究所学位论文, 2009: 1-55
杨明山. 医学英语术语教程[M]. 上海: 上海中医药大学出版社, 2006.
张金松. 基于引文上下文分析的文献检索技术研究[D]. 大连: 大连海事大学, 2013.
章宜华. 计算词典学与新型词典[M]. 上海: 上海辞书出版社, 2004.
周晓音. SNOMED CT在临床路径中应用探讨[J]. 医学信息学杂志, 2010.
Angrosh M A, Cranefield S, Stanger N. Context identification of sentences in related work sections using a conditional random field: towards intelligent digital libraries[C]. Proceedings of the 10th annual joint conference on Digital libraries, Gold Coast: ACM, 2010: 293-302.
Atkins B T S, Rundell M. The Oxford Guide to Practical Lexicography[M]. New York: Oxford University Press, 2008.
Bejoint H. The Lexicography of English[M]. Oxford: Oxford University Press, 2010.
Chen M H, Huang S T, Hsieh H T, et al. Flow: a first-language-oriented writing assistant system[J]. ACL System Demonstrations, 2012, 24(3): 157-162.
Chen Y. Dictionary Use and EFL Learning: A Contrastive Study of Pocket Electronic Dictionaries and Paper Dictionaries[J]. International Journal of Lexicography, 2010(3): 275-306.
Chen Y. Studies on Bilingualized Dictionaries: The User Perspective[J]. International Journal of Lexicography, 2011(2): 161-197.
De Schryver G M. Lexicographers’ Dreams in the Electronic-Dictionary Age[J]. International Journal of Lexicography, 2003(2): 143-199.
Dziemianko A. Paper or Electronic: The Role of Dictionary Form in Language Reception, Production and the Retention of Meaning and Collocations[J]. International Journal of Lexicography, 2010(3): 257-273.
Frankenberg-Garcia A. Learners’ Use of Corpus Examples[J]. International Journal of Lexicography, 2012(3): 273-296.
Granger S, Paquot M. Electronic Lexicography[M]. Oxford: Oxford University Press, 2012.
Hacken P T, Abel A, Knapp J. Word Formation in an Electronic Learners’ Dictionary: ELDIT[J]. International Journal of Lexicography, 2006(3): 243-256.
Michael C. Searching and mining the web for personalized and specialized information[D]. Tucson: The University of Arizona, 2003.
Svensen B A. Handbook of Lexicography: The Theory and Practice of Dictionary-Making[M]. Cambridge: Cambridge University Press, 2009.
公开日期:

 2019-06-04    

多维度智能英语词汇学习知识库研究.屠少辉

链接

题名:

 多维度智能英语词汇学习知识库研究    

姓名:

 屠少辉    

学号:

 1601210727    

论文语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 姚亚芝    

导师1单位:

 软件与微电子学院    

导师2姓名:

 高志军    

导师2单位:

 软件与微电子学院    

论文答辩日期:

 2019-05-27    

外文题名:

 Study on Multi-dimensional Knowledge Base System for English Vocabulary Intelligent Learning    

关键词:

 英语词汇学习 多维度 知识库 资源加工    

外文关键词:

 English vocabulary learning Multi-dimensional Knowledge base Resource processing    

论文摘要:

词汇教学是英语教学的重要组成部分。在词汇教学中,教学材料直接影响到学生的学习效果。经过调研,发现虽然目前可使用的英语词汇学习资料丰富多样,但是内容分散,良莠不齐,无法完全满足教师和学生的资源需求。如果能够发挥计算机和互联网技术的优势,对词汇学习材料进行采集和整合,将对英语词汇教学质量的提升有所助益。

为整合英语词汇教学中的资源,本研究广泛地收集词汇学习材料,使用计算机技术对文本进行加工,根据一定逻辑对素材进行组织,建设英语词汇学习知识库。本文研究的关键问题有知识库建设的方法和思路、大规模资源汇总、资源加工存储的规范标准和自动处理程序的设计与实现。

本研究从互联网上采集了大量词汇学习资源,进行汇总,建设资源库。之后在文献分析的基础上构建词汇知识模型,并以此作为整合资源的逻辑依据。词汇知识模型从习得过程和知识内容两个角度出发组织词汇知识,响应了学习者在习得词汇过程中词汇知识需求的动态变化。模型以词汇习得过程为框架,词汇知识为内容,将词汇习得过程分为感知、理解、联想和输出四个阶段,让知识内容聚合到音位、形式、语境、语义、搭配、词源、产出和主题八个维度下。在建设知识库时,根据词汇知识模型设计知识库的结构和资源加工的规范标准,对例句和搭配的自动抽取等关键问题进行研究。最终,本研究整合了通识英语教学中的词汇学习资源,解决了知识库建设中的关键问题,开发了自动处理程序,实现了一定规模的词汇学习知识库。

对于整合后的词汇学习资源,本文进行了客观指标评估、准确率抽样检查和场景检查,证明了知识库能够满足教师和学生的资源需求,帮助改善词汇教学的效果。

本研究的创新之处包括以下三点:1)在文献研究的基础上构建词汇知识模型,根据知识模型从各类词汇学习资料中针对性地提取内容,然后进行整合,保证了内容的丰富性和资源间的关联性,去掉了重复性和低质量的学习材料,是一种语言学习资源建设的新思路;2)知识库将学习资源聚合到不同的维度下,实现了资源的模块化,在应用知识库时,可以根据需求灵活地调用和组织内容;3)本研究在加工资源时,对自动处理程序中的关键问题进行研究,使用自然语言处理等计算机技术提高资源建设的效率。

外文摘要:

Vocabulary teaching holds an important position in English language teaching, and teaching materials directly influence students' learning outcomes. A survey found that although the English vocabulary learning resources currently available are rich and varied, they are scattered and uneven in quality, and fail to fully meet the resource needs of teachers and students. If vocabulary learning materials were collected and integrated with computer and Internet technologies, the quality of English vocabulary teaching could be improved.

To integrate the resources used in English vocabulary teaching, this study collected a wide range of vocabulary learning materials, processed the texts with computer technology, organized the materials according to a defined logic, and built an English vocabulary learning knowledge base. The key issues in this study are the methods and ideas of knowledge base construction, the gathering of large-scale resources, the standards for resource processing and storage, and the design and implementation of automatic processing programs.

This study collected a large amount of vocabulary learning resources from the Internet and pooled them into a resource library. Then, on the basis of a literature analysis, it constructed a vocabulary knowledge model as the logical basis for integrating resources. The model organizes vocabulary knowledge from the perspectives of acquisition process and knowledge content, responding to the dynamic changes in learners' vocabulary knowledge needs as they acquire vocabulary. The model takes the vocabulary acquisition process as the framework and vocabulary knowledge as the content. The acquisition process is divided into four stages: perception, understanding, association, and output. The knowledge content is aggregated into eight dimensions: phoneme, form, context, semantics, collocation, etymology, output, and topic. When constructing the knowledge base, the structure of the knowledge base and the standards for resource processing were designed according to the vocabulary knowledge model, and key issues such as the automatic extraction of example sentences and collocations were studied. In the end, this study integrated the vocabulary learning resources used in general English teaching, solved the key problems in knowledge base construction, developed automatic processing programs, and built a vocabulary learning knowledge base of a certain scale.
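
The abstract mentions automatic extraction of collocations without stating the method. As one common stand-in, the sketch below scores adjacent word pairs by pointwise mutual information (PMI) and keeps those that occur at least `min_count` times. The tokenizer, the threshold, and the function name `bigram_pmi` are assumptions made for this illustration.

```python
import math
import re
from collections import Counter


def bigram_pmi(sentences, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information.

    PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ), estimated from raw counts.
    Pairs seen fewer than min_count times are dropped, because PMI is
    unreliable for rare events.
    """
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical mini-corpus for illustration.
corpus = [
    "She made a decision quickly.",
    "He made a decision today.",
    "They made a cake.",
]
ranked = bigram_pmi(corpus)
```

On this tiny corpus only the pairs `("made", "a")` and `("a", "decision")` pass the frequency threshold. A production extractor would typically add part-of-speech filters and a larger window to capture non-adjacent collocations.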

For the integrated vocabulary learning resources, this study conducted an objective index evaluation, accuracy sampling checks, and scenario checks, which showed that the knowledge base can meet the resource needs of teachers and students and help improve the effect of vocabulary teaching.

The innovations of this study include the following three points: 1) a vocabulary knowledge model is built on the basis of a literature review, and content is extracted from various vocabulary learning materials according to the model and then integrated, which ensures the richness of the content and the connections between resources while removing repetitive and low-quality learning materials; this is a new approach to building language learning resources; 2) the knowledge base aggregates learning resources into different dimensions and thereby modularizes them, so that users or applications can flexibly invoke and organize content according to their needs; 3) this study examined the key issues in the automatic processing programs used to process resources, and used computer technologies such as natural language processing to improve the efficiency of resource construction.
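
The modularization in point 2 can be sketched as a small data structure in which each word entry stores its resources per dimension and callers request only the dimensions they need. The eight dimension names follow the abstract; the class names, the storage layout, and the sample content are assumptions, and the mapping from the four acquisition stages to dimensions is not given in the abstract, so it is left out.

```python
from dataclasses import dataclass, field
from enum import Enum


class Dimension(Enum):
    """The eight knowledge dimensions named in the abstract."""
    PHONEME = "phoneme"
    FORM = "form"
    CONTEXT = "context"
    SEMANTICS = "semantics"
    COLLOCATION = "collocation"
    ETYMOLOGY = "etymology"
    OUTPUT = "output"
    TOPIC = "topic"


@dataclass
class WordEntry:
    """One headword with its learning resources grouped by dimension."""
    headword: str
    resources: dict = field(default_factory=dict)

    def add(self, dimension: Dimension, item: str) -> None:
        """Attach a resource to one dimension of this entry."""
        self.resources.setdefault(dimension, []).append(item)

    def select(self, *dimensions: Dimension) -> dict:
        """Return copies of the resources for the requested dimensions only."""
        return {d: list(self.resources.get(d, [])) for d in dimensions}


# Hypothetical sample entry.
entry = WordEntry("decide")
entry.add(Dimension.COLLOCATION, "make a decision")
entry.add(Dimension.ETYMOLOGY, "from Latin decidere")
view = entry.select(Dimension.COLLOCATION, Dimension.TOPIC)
```

An application that only drills collocations could call `select(Dimension.COLLOCATION)` and ignore the other modules, which is the kind of flexible reuse the abstract describes.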

分类号:

 TP3    

论文总页数:

 53    

参考文献总数:

 70    

参考文献列表:
[1] zimmerman c b. historical trends in second language vocabulary instruction // j. coady, t. huckin (eds.) second language vocabulary acquisition. cambridge: cambridge university press. 1997:146-163.
[2] 吕菲, 齐聪. 初中英语词汇教学研究述评[j]. 教育现代化, 2018, 5(40):391-393.
[3] 邓亿书, 汤成强. 移动互联网视野下高中英语教育资源库建设探究[j]. 科学咨询(科技·管理), 2018, 602(09):114.
[4] 陈旭辉. 技术支持下的初中英语词汇教学[j]. 内蒙古师范大学学报, 2012, 25(6):125-127.
[5] 俞士汶, 朱学锋. 综合型语言知识库及其在语言教学中的应用[j]. 北华大学学报(社会科学版), 2014, 15(3):4-9.
[6] 刘书昊. 基于网络平台的研究生专业英语词汇预习资源设计[d]. 沈阳师范大学. 2018.
[7] 黄建滨. 关于《大学英语教学大纲(修订本)》词汇表的说明[j]. 外语界, 1999(4):27-31.
[8] 马广惠, 黄文, 苗娟, 等. 大学非英语专业新生英语入学水平测试与分析[j]. 南京师大学报(社会科学版), 2006(1):82-88.
[9] 吕长竑. 词汇量与语言综合能力、词汇深度知识之关系[j]. 外语教学与研究(外国语文双月刊), 2004(2):116-123.
[10] 戴俊红. 非英语专业大学生四级阶段词汇量调查[j]. 重庆理工大学学报(社会科学), 2013(1):118-122.
[11] 童淑华. 第二语言产出性词汇习得研究[m]. 吉林: 吉林大学出版社, 2010.
[12] 何道瑞. 新课标下高中英语词汇教学新思路[j]. 北京: 考试(教研版), 2006.
[13] 叶哲琳. 新课标下高中英语词汇教学新思路[j]. 石家庄: 校园英语, 2018(20).
[14] 管淑红. 大学英语词汇使用现状研究[j]. 华东交通大学学报, 2004, 21(3):132-135.
[15] 毛浩然, 林晓琴. 高频优先、共同经验、成功体验--词汇教与学策略新探[j]. 基础教育外语教学研究, 2006(11):21-24.
[16] 戴雪莹. 高中英语词汇教学现状调查分析[d]. 东北师范大学, 2009.
[17] wilkins d a. linguistics in language teaching [m]. london: edward arnold. 1972.
[18] hatch, e. and c. brown. vocabulary , semantics, and language education [m ]. cup, 1995.
[19] 陈新仁. 外语词汇习得过程探析[j]. 外语教学, 2002, 23(4).
[20] skehan p. a cognitive approach to language learning [m]. 上海外语教育出版社, 1999.
[21] garcia p. input, interaction, and the second language learner [m] // input, interaction, and the second language learner. lawrence erlbaum associates, 1997.
[22] schmitt n, m mccarthy. vocabulary: description, acquisition and pedagogy [c]. cambridge: cup, 1997.
[23] coady j, t. huckin. second language vocabulary acquisition [c]. cambridge: cup, 1997.
[24] nation p. learning vocabulary in another language [m]. cambridge: cup , 2001.
[25] read j. assessing vocabulary [m]. cambridge: cup, 2000.
[26] thornbury s. how to teach vocabulary [m]. longman, 2002 .
[27] nation p. teaching and learning vocabulary [m]. new york: newbury house publishers, 1990.
[28] 张文忠, 吴旭东. 课堂环境下二语词汇能力发展的认知心理模式[j]. 现代外语, 2003, 26(4):373-384.
[29] 刘绍龙. 论二语词汇的习得与发展--基于实证调查的词汇知识发展差异假说[j]. 外语教学, 2003, 24(6):47-50.
[30] jiang n. lexical representation and development in a second language [j]. applied linguistics, 2000(1):47-77.
[31] read j. the development of a new measure of l2 vocabulary knowledge [j]. language testing, 1993(10):355-371
[32] 李红. 第二语言语义提取中的词汇知识效应[j]. 现代外语, 2003, 26(4).
[33] wilks c, meara p. untangling word webs: graph theory and the notion of density in second language word association networks [j]. second language research, 2002, 18(4):303-324.
[34] 马广惠. 二语词汇知识理论框架[j]. 外语与外语教学, 2007(4).
[35] atkinson r c, shiffrin r m. human memory: a proposed system and its control processes [j]. psychology of learning & motivation, 1968, 2:89-195.
[36] craik f i m, lockhart r s. levels of processing: a framework for memory research [j]. journal of verbal learning & verbal behavior, 1972, 11(6):671-684.
[37] craik f i m. depth of processing and the retention of words in episodic memory [j]. journal of experimental psychology general, 1975, 104(3):268-294.
[38] wittrock m c. learning as a generative process [j]. educational psychologist, 1974, 11/1.
[39] hulstijn j h. retention of inferred and given word meanings: experiments in incidental vocabulary learning [m] // vocabulary and applied linguistics. 1992.
[40] 桂诗春. 以语料库为基础的中国学习者英语失误分析的认知模型[j]. 现代外语, 2004, 27(2):129-139.
[41] tinkham s f, weaver lariscy r a. a diagnostic approach to assessing the impact of negative political television commercials [j]. journal of broadcasting & electronic media, 1993, 37(4):377-399.
[42] carter r, mccarthy m. written and spoken vocabulary [a]. cambridge: cambridge university press, 1997.
[43] 韦萍. 英语词缀与词汇学习[j]. 海南热带海洋学院学报, 2011, 18(1):146-147.
[44] 麦莹莹. 英语词根学习的重要性分析[j]. 开封教育学院学报, 2014(2):143-144.
[45] richards j. the role of vocabulary learning [ j ]. tesol quarterly, 1976(10) .
[46] 范琳, 王庆华. 英语词汇学习中的分类组织策略实验研究[j]. 外语教学与研究(外国语文双月刊), 2002, 34(3):209-212.
[47] krashen s d, terrell t d . the natural approach: language acquisition in the classroom.[m]. the alemany press, p.o. box 5265, san francisco, ca 94101. 1983.
[48] cohen a d, aphek e. retention of second-language vocabulary overtime: investigating the role of mnemonic associations[j]. system, 1980, 8(3):221-235.
[49] 文秋芳. 英语学习策略论[m]. 上海: 上海外语教育出版社, 1996.
[50] 桂诗春. 中国学生英语学习心理[m]. 湖南教育出版社, 1992.
[51] 袁玲丽. 联想策略与直接词汇教学研究[j]. 西安外国语大学学报, 2005, 13(3):53-55.
[52] nist s l, olejnik s. the role of content and dictionary definitions on varying levels of word knowledge [j]. reading research quarterly, 1995, 172-193.
[53] laufer b. corpus-based versus lexicographer examples in comprehension and production of new words [m]. // fontenelle t. practical lexicography. oxford: oxford university press.2008:71-76.
[54] 赵海威. 基于行为特征和数据分析的外语词汇学习模型研究[d].北京大学.2017.
[55] 孙晓明. 第二语言词汇研究[m].中央民族大学出版社, 2006.
[56] 李传秀. 词源教学在初中英语词汇教学中的效用研究[d]. 河北师范大学, 2014.
[57] 韩秀莲, 刘立莉. 词源教学与大学英语词汇教学[j]. 卷宗, 2013(6):76-77.
[58] wray a. formulaic language and the lexicon [j]. london: cambridge university press,2002.
[59] altenberg b. on the phraseology of spoken english: the evidence of recurrent word combinations [a] // a cowie(ed.). phraseology: theory,analysis and applications. oxford: oxford university press,1998, 101-108.
[60] 张长岚. the lexicalapproach语境中的词汇[j]. 淮阴师范学院学报, 2000, 6: 109-111.
[61] 丁言仁. 第二语言习得研究与外语学习[m]. 上海外语教育出版社, 2005.
[62] 俞士汶, 段慧明, 朱学锋, 等. 综合型语言知识库的建设与利用[j]. 中文信息学报, 2004, 18(5):2-11.
[63] 支流. 综合型语言知识库系统原型的开发与中文缩略语知识库建设[d]. 北京大学. 2008.
[64] 陈楚祥. 词典评价标准十题[j]. 辞书研究, 1994(1):10-21.
[65] 张后尘. 双语词典质量标准与质量保障对策[j]. 辞书研究, 1995(6):25-33.
[66] 魏向清. 关于构建双语词典批评理论体系的思考[j]. 外语与外语教学, 2001(1).
[67] 范凯. nosql数据库综述[j]. 程序员, 2010(6):76-78.
[68] melo g d , weikum g. extracting sense-disambiguated example sentences from parallel corpora [j]. 2009.
[69] kilgarriff a, baisa v, bušta j, et al. the sketch engine: ten years on[j]. lexicography, 2014, 1(1):7-36.
[70] frank smadja, retrieving collocations from texts [j]: xtract, computational linguistics, 1993, 13-19.
公开日期:

 2019-06-04    

法律英语词汇学习系统研究与设计.包珍

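Link

Title:

 法律英语词汇学习系统研究与设计

Name:

 包珍

Student ID:

 1601210427

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Public release:

 Public

Degree level:

 Master's

Degree:

 Master of Engineering (professional degree)

Degree-granting institution:

 Peking University

School:

 School of Software and Microelectronics

Supervisor 1 name:

 姚亚芝

Supervisor 1 affiliation:

 School of Software and Microelectronics

Supervisor 2 name:

 俞敬松

Supervisor 2 affiliation:

 School of Software and Microelectronics

Thesis defense date:

 2019-05-27

English title:

 Research and Design of a Legal English Vocabulary Learning System

Keywords (Chinese):

 法律英语词汇 学习资源 词汇学习 词汇复习 词汇推荐策略

Keywords (English):

 Legal English vocabulary Learning resources Vocabulary Learning Vocabulary Review Vocabulary recommendation strategy

Abstract: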

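As China's foreign-related legal affairs grow more frequent, the importance of legal English is beyond doubt. Learning legal English well requires first understanding its distinctive features, which, according to previous research, are reflected primarily in its vocabulary. This study aims to design a legal English vocabulary learning system that draws on the advantages of mobile learning.

A survey of current practice shows that legal English vocabulary teaching leaves the following learning-efficiency problems unresolved. First, the target vocabulary is unsystematic, with no attribute annotation or association relations among words. Learning resources are mainly paper textbooks, whose content is limited; related vocabulary learning systems merely digitize the word lists in paper textbooks and likewise lack attribute annotation and association relations. Second, vocabulary is not learned in sufficient depth, and key and difficult points are not highlighted. The basic word information used in general English learning does not meet the depth required for legal English vocabulary, and near-synonym discrimination, a known teaching difficulty, receives no dedicated design. Third, assessment content is insufficient and the form of recurrence is monotonous. Assessment covers only basic word information and lacks the special information of legal English vocabulary; recurrence is achieved only through tests. Fourth, the vocabulary recommendation strategies of general English are not fully applicable to legal English vocabulary. Word-priority calculation ignores the fact that legal English makes heavy use of special terms, and near-synonym learning is not recommended dynamically.

To address these problems, this study designs a legal English vocabulary learning system, guided by vocabulary teaching theory and second language acquisition theory and drawing on the advantages of mobile learning. First, it integrates learning resources along multiple dimensions to build a target vocabulary repository, annotates attributes such as etymology and branch of law, and establishes association relations among near-synonyms. Second, in line with the depth required for legal English learning, it adds learning content such as constituent elements and gives the near-synonym discrimination module focused design. Third, it designs multiple review question types that test both the basic and the special information of legal English vocabulary, and has the vocabulary recur in a variety of legal contexts. Fourth, combining the characteristics of legal English vocabulary with each learner's progress, it incorporates an initial familiarity factor into the word-priority calculation and optimizes the near-synonym recommendation strategy.

For resource construction and the vocabulary recommendation strategy, experts were interviewed to verify that the design is reasonable and effective. For the functional design, questionnaires and in-depth interviews with learners verified that, at a similar level of cognitive load, the system outperforms existing vocabulary learning systems in achieving learning goals.

The system designed in this study captures the characteristics of legal English vocabulary and the key and difficult points of teaching it, exploits the advantages of mobile learning, and overcomes the limitations of paper resources. As a useful supplement to classroom teaching, it helps learners study legal English vocabulary efficiently.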

English abstract:

With the growth of China's foreign-related legal affairs, legal English plays an increasingly important role. Students who want to learn legal English well must first understand its distinctive features, which are mainly reflected in its vocabulary. This study aims to design a legal English vocabulary learning system with the help of mobile learning.

When it comes to the efficient learning of legal English vocabulary, several problems remain. First, learning resources have not been integrated effectively. The amount of vocabulary in paper textbooks is limited; current systems neither label vocabulary attributes nor link synonyms. Second, current systems fail to meet the depth required for learning legal English vocabulary and do not treat synonym discrimination as a key point. Third, the review content of current systems cannot meet the needs of students with different learning objectives, and the review format relies only on tests, which students tend to find boring. Fourth, the vocabulary recommendation strategy of current systems is not suitable for legal English vocabulary. The difficulty grading ignores the influence of the learner's previous English level. In addition, the recommendation of synonyms cannot be adjusted dynamically.

This study tries to solve these problems in the following ways. First, it integrates learning resources along multiple dimensions and labels the attributes of legal English vocabulary to meet the needs of different students at the learning portal. Second, it selects the learning content according to the characteristics of legal English vocabulary, trying to highlight the difficult points of legal English vocabulary teaching. Third, it designs a variety of question types to meet learners' different review needs and has learned vocabulary recur in different legal contexts to stimulate students' interest. Fourth, it takes students' prior English level into account to build an effective learning sequence for legal English vocabulary and different study strategies for synonyms.

Experts were invited to evaluate the resource integration and vocabulary recommendation parts through interviews, which verified the rationality and effectiveness of the design. In addition, questionnaires and interviews with legal English vocabulary learners showed that the cognitive load of the system designed in this study is similar to that of existing systems, while it is more conducive to achieving learning goals.

The legal English vocabulary learning system designed in this study captures the characteristics of legal English vocabulary and its key teaching points, overcomes the limitations of paper resources by virtue of the advantages of mobile learning, and serves as a useful supplement to in-class teaching, helping students to learn legal English vocabulary efficiently.

Classification number:

 H08

Total pages:

 69

Number of references:

 64

Reference list:
曹飞. 2007. 法律英语教学的基本定位及其案例教学法[J]. 黑龙江省政法管理干部学院学报, (04):124-125.
陈庆柏. 2006. 涉外经济法律英语. 北京:法律出版社.
杜金榜. 2004. 法律语言学. 上海:上海外语教育出版社.
龚德英. 2009. 多媒体学习中认知负荷的优化控制[D]. 西南大学.
侯萍英. 2010. 法律英语文本的结构和词汇特点[J]. 法制与社会, (24):226+237.
侯天友. 2009. 浅谈法律英语的词汇特点[J]. 读与写(教育教学刊), 6(04):31-32.
黄振中, 夏扬. 2010. 法律英语教学的困境与改革[J]. 中国大学教学, (04):48-51.
寇俊瑜. 2013. 基于法律英语词汇特点的法律英语词汇翻译策略研究[J]. 高教学刊, (15):188-189+193.
冷帅, 苏晓凌, 董燕清, 栾姗, 刘克江. 2017. 中国涉外法律服务业探析(上)[J]. 中国律师, (05):73-76.
李冰. 2007. 语义场理论与法律英语词汇教学[J]. 科教文汇(上旬刊), (06):68-69.
李剑波. 2003. 论法律英语的词汇特征[J]. 中国科技翻译, (02):16-21.
李克兴, 张新红. 2006. 法律文本与法律翻译. 北京:中国对外翻译出版公司.
李勤. 2011. 试论需求分析理论框架下的大学英语ESP教学[J]. 云南财经大学学报(社会科学版), 26(04):146-147.
李艳燕. 2015. 法律英语词汇教学策略研究[J]. 当代教育实践与教学研究, (06):53+52.
刘雪洁. 2016. 图式理论框架下高中英语同义词学习困难原因的实证研究[D]. 哈尔滨师范大学.
刘瑶. 2014. 语义场理论在初中英语词汇教学中的应用研究[D]. 南京师范大学.
马雯. 2013. 法律英语词法特征探微──同(近)义词的并置及翻译[J]. 佳木斯教育学院学报, (11):410-412.
庞维国. 2011. 认知负荷理论及其教学涵义[J]. 当代教育科学, (12):23-28.
荣岩. 2018. 医学英语词汇学习系统研究与设计[D]. 北京大学.
阮绩智. 2009. ESP需求分析理论框架下的商务英语课程设置[J]. 浙江工业大学学报(社会科学版), 8(03):323-327+344.
沙丽金. 2005. 法律英语教学中的图式理论应用[J]. 郑州航空工业管理学院学报(社会科学版), (03):115-116.
沙丽金. 2005. 以语境理论为基础的法律英语词汇教学[J]. 浙江万里学院学报, (03):155-157.
束定芳. 2000. 现代语义学. 上海:上海外语教育出版社.
司国东, 宋鸿陟, 赵玉. 2013. 认知负荷理论基础上的移动学习资源设计策略研究[J]. 中国远程教育, (09):88-92.
宋雷, 张绍全. 2010. 英汉对比法律语言学:法律英语翻译进阶. 北京:北京大学出版社.
宋雷. 2011. 法律术语翻译要略:正确使用法律英语同义、近义词语. 北京:中国政法大学出版社.
孙崇勇, 刘电芝. 2013. 认知负荷主观评价量表比较[J]. 心理科学, 36(01):194-201.
王璐. 2013. 基于本体的个性化推荐系统[D]. 电子科技大学.
王青梅. 2003. 法律英语教学模式的探索——以案例教学法为例[J]. 宁波大学学报(教育科学版), (05):111-112+139.
杨惠中. 2002. 语料库语言学导论. 上海:上海外语教育出版社.
杨硕, 李想, 刘红蕾. 2018. 基于ESP词汇分级系统下的大学英语教学研究——以林业科学相关专业为例[J]. 科教导刊(中旬刊), (09):59-61+74.
杨彦军, 郭绍青. 2012. E-learning学习资源的交互设计研究[J]. 现代远程教育研究, (01):62-67.
叶家春, 曾杰. 2016. 英语词汇教学的多模态—认知策略模式[J]. 教育评论, (08):127-130.
张法连. 2008. 法律英语词汇研究. 北京:中国方正出版社.
张法连. 2013. 法律英语术语双解. 北京:中国法制出版社.
张海征. 2016. 高校法律英语教学的现状和对策[J]. 课程教育研究, (07):116-117.
张金会. 2006. 法律英语中的案例教学[J]. 吉林农业科技学院学报, (04):56-58.
赵艳平. 2015. 高中英语词汇自适应学习系统的设计与开发[D]. 山东师范大学.
周红. 2008. 论法律英语的词汇特征[J]. 广西政法管理干部学院学报, (03):119-123.
Bartlett F.C. 1932. Remembering: A Study in Experimental and Social Psychology. London: Cambridge University Press.
Brunken R., Plass J. L., Leutner D. 2003. Direct measurement of cognitive load in multimedia learning. Educational Psychologist, 38(1), 53-61.
Carrell P.L., Eisterhold J.C. 1983. Schema theory and ESL reading pedagogy[J]. TESOL Quarterly, 17(4):553-573.
Chen C.M., Chung C. J. 2008. Personalized mobile English vocabulary learning system based on item response theory and learning memory cycle[J]. Computers & Education, 51(2):0-645.
Ellis H. 1965. The Transfer of Learning[M]. New York: Macmillan.
Ho C.F. 1992. A method of automatic adjusting the timing of review based on a microprocessor built-in device for user to memory strings. TW, 200519622, 1992-12-4
Hsieh, T.C., Wang T.I., Su C.Y., Lee M.C. 2012. A fuzzy logic-based personalized learning system for supporting adaptive English learning. Educational Technology & Society, 15 (1), 273–288.
Hutchinson T, Waters A. 1993. English for Special Purposes: A Learning-Centered Approach[M]. Cambridge: Cambridge University Press.
Kalyuga S., Chandler P., Sweller J. 1999. Managing split‐attention and redundancy in multimedia instruction. Applied Cognitive Psychology: The Official Journal of the Society for Applied Research in Memory and Cognition, 13(4), 351-371.
Kilgarriff A., Husák M., McAdam K., Rundell M., Rychlý P. 2008. GDEX: Automatically finding good dictionary examples in a corpus. In Proc. EURALEX. Barcelona.
Krashen S. D. 1985. The Input Hypothesis: Issues and Implications. London: Longman.
Laufer B. 2006. Comparing focus on form and focus on forms in second-language vocabulary learning. Canadian Modern Language Review, 63(1), 149-166.
Laufer B., Paribakht T.S. 1998. The relationship between passive and active vocabularies: Effects of language learning context[J]. Language Learning, 1998, 48(3):365-391.
Mellinkoff, D. 1963. The Language of the Law. Boston: Little, Brown and Company.
Nagy W. 1995. On the role of context in first- and second-language vocabulary learning. Vocabulary Description Acquisition & Pedagogy, 24.
Nation P. 1982. Beginning to learn foreign vocabulary: A review of the research. RELC Journal, 13(1), 14-36.
Nation P. 1990. Teaching and Learning Vocabulary. Boston: Heinle & Heinle.
Nation P., Waring R. 1997. Vocabulary size, text coverage and word lists. In Schmitt, N, McCarthy M. (eds.), Vocabulary Description, Acquisition, Pedagogy.
Nist S L., Olejnik S. 1995. The role of content and dictionary definitions on varying levels of word knowledge [J]. Reading Research Quarterly, 172-193.
Paas F.G., van Merriënboer J.J., Adam, J.J. 1994. Measurement of cognitive load in instructional research. Perceptual and Motor Skills, 79(1), 419-430.
Richard J C. 1969. A psycholinguistic measure of vocabulary selection[J]. IRAL, 8(2):87-102.
Richterich R., Chancerel L. 1980. Identifying the Needs of Adults Learning a Foreign Language[M]. Oxford: Pergamon Press.
Sinclair J., Renouf A. 1988. A lexical syllabus for language learning. In Carter, R. and McCarthy, M., editors, Vocabulary and Language Teaching. London: Longman, 140–60.
Swain M. 1993. The output hypothesis: Just speaking and writing aren't enough[J]. The Canadian Modern Language Review 50:158-164.
Sweller J. 1988. Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257-285.
Publication date:

 2019-06-04    

基于思考帽理论的合作探究教学设计与实证.陈钗平

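Link

Title:

 基于思考帽理论的合作探究教学设计与实证

Name:

 陈钗平

Student ID:

 1601210453

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Public release:

 Public

Degree level:

 Master's

Degree:

 Master of Engineering (professional degree)

Degree-granting institution:

 Peking University

School:

 School of Software and Microelectronics

Supervisor 1 name:

 姚亚芝

Supervisor 1 affiliation:

 School of Software and Microelectronics

Supervisor 2 name:

 俞敬松

Supervisor 2 affiliation:

 School of Software and Microelectronics

Thesis defense date:

 2019-05-27

Keywords (Chinese):

 六顶思考帽 支架式理论 合作探究教学 思辨能力

Abstract: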

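In the information age, the collaborative inquiry teaching model gives students more room to think and richer training in thinking, and it has become a popular research topic in education. In collaborative inquiry teaching, students learn mainly through small-group discussion while the teacher provides supporting guidance. Many education researchers have proposed that collaborative inquiry teaching is an effective way to promote students' critical thinking.

In practice, however, the model faces some problems. As a teaching assistant for the course Principles and Practice of Translation Technology (《翻译技术原理与实践》), the author identified three problems in current collaborative inquiry teaching: (1) without students' critiques of one another's research or the teacher's evaluation of students' stated views, it is hard to cultivate rigor in students' thinking; (2) when students do not actively offer suggestions to peers, or the teacher does not provide timely prompts, flexibility of thinking goes untrained; (3) when students do not communicate their feelings to one another, or the teacher does not offer positive encouragement, students' autonomy of thinking is unlikely to be engaged.

To address these problems, the author explains why the Six Thinking Hats theory needs to be introduced into the existing collaborative inquiry teaching model, sets out the research approach and methods, and proposes a new collaborative inquiry teaching model based on the Six Thinking Hats theory. The new model has two main parts: students follow a thinking-hats discussion procedure that strengthens peer support in group discussion, and the teacher uses the thinking hats to give evaluative feedback, which strengthens teacher guidance in group discussion.

To verify the effectiveness of the new model, the author ran a teaching experiment from March to June 2018 with 14 students from the School of Foreign Languages at Peking University. The experiment used a single-group pretest-posttest design, with the data drawn from the students' online discussion records and questionnaires. Quantitative and qualitative analyses compared the pretest and posttest data to assess how the thinking-hats-based collaborative inquiry method performs in an actual classroom.

The results show that students' critical thinking improved in flexibility, rigor, and autonomy, which indicates to some extent that collaborative inquiry teaching based on the thinking hats theory has a positive effect on students' critical thinking. The study explores the collaborative inquiry teaching model and offers new ideas for optimizing it further.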

Classification number:

 TP3

Total pages:

 60

Number of references:

 51

Reference list:
高建凤. 体验—探究性教学在高师公共心理学教学中的应用研究[D]. 山东师范大学, 2008.
何克抗. "建构主义的教学模式, 教学方法与教学设计." 北京师范大学学报: 社会科学版 5 (1997): 74-81.
黄伟. 六顶思考帽在初中历史教学中的应用[J]. 辽宁教育研究, 2005 (9): 95-96.
李纯. 小组合作探究学习模式在初中信息技术教学中的应用研究[D]. 陕西师范大学, 2017.
李静. 基于核心素养的“支架式”教学在中学化学中的应用研究[D]. 河北师范大学, 2018.
刘芳. 自主合作探究学习模式下初中快速作文的研究与应用[D]. 东北师范大学, 2006.
钱海锋. 化学教学中的合作探究学习研究[D]. 苏州大学, 2007.
丘丽红. 支架理论在高职英语写作教学中的应用研究[D]. 闽南师范大学, 2018.
钱雯. 论建构主义理论及支架式教学法在对外汉语初级阶段口语课堂教学中的设计和应用[D]. 河北师范大学, 2012.
时培忠. 基于支架理论的高中英语写作教学研究[D]. 陕西理工大学, 2018.
王薇. 中小学生的创造性思维训练研究[D]. 河北师范大学, 2014.
吴文文. 试析高中生的历史批判性思维及其培养模式[D]. 温州大学, 2012.
王晓艳. 高中思想政治课自主·合作·探究学习的整合研究[D]. 河北师范大学, 2017.
俞敬松, 王华树. 计算机辅助翻译硕士专业教学探讨[J]. 中国翻译, 2010 (3): 38-42.
俞敬松, 陈泽松. 浅析 MOOC 与翻转课堂在“翻译技术实践”课程中的应用[J]. 工业和信息化教育, 2014(11):17-28.
于军烨. 支架式教学在小学科学课中的应用研究[D]. 聊城大学, 2018.
张威. 支架式教学理论在对泰汉语课堂教学中的应用[D]. 广西师范大学, 2018.
祝学英. 自主合作探究学习方式下初中阅读教学的实践与研究[D]. 东北师范大学, 2010.
周艳. 高中英语新手和专家型教师师生互动话语支架比较研究[D]. 宁波大学, 2018.
赵英芳. 促进学生创新思维发展的教学策略研究[D]. 上海师范大学, 2006.
Ahlam, ES, and Hala Gaber. "Impact of Problem-based Learning on Student's Critical Thinking Dispositions, Knowledge Acquisition and Retention." Journal of Education and Practice 5, no. 14 (2014): 74-83.
Bell, Thorsten, Detlef Urhahne, Sascha Schanze, and Rolf Ploetzner. "Collaborative Inquiry Learning: Models, Tools, and Challenges." International journal of science education 32, no. 3 (2010): 349-77.
Cioffi, Jane Marie. "Collaborative Care: Using Six Thinking Hats for Decision Making." International journal of nursing practice 23, no. 6 (2017): e12593.
Edel-Malizia S, Brautigam B, Bittner K, et al. Investigating Interactive Video Assessment Tools for the Flipped Classroom[J]. 2015.
Ge, Xun, and Susan M Land. "Scaffolding Students’ Problem-Solving Processes in an Ill-Structured Task Using Question Prompts and Peer Interactions." Educational Technology Research and Development 51, no. 1 (2003): 21-38.
Hmelo-Silver, Cindy E. "Problem-Based Learning: What and How Do Students Learn?". Educational psychology review 16, no. 3 (2004): 235-66.
Iwaoka, Wayne T, Yong Li, and Walter Y Rhee. "Measuring Gains in Critical Thinking in Food Science and Human Nutrition Courses: The Cornell Critical Thinking Test, Problem-Based Learning Activities, and Student Journal Entries." Journal of Food Science Education 9, no. 3 (2010): 68-75.
Jahanzad, Farzaneh. "Influence of the Deeper Scaffolding Framework on Problem-Solving Performance and Transfer of Knowledge." Oklahoma State University, 2012.
Karadag, Mevlude, Serdar Saritas, and Ergin Erginer. "Using the 'Six Thinking Hats' Model of Learning in a Surgical Nursing Class: Sharing the Experience and Student Opinions." Australian Journal of Advanced Nursing, The 26, no. 3 (2009): 59.
Kim, Nam Ju. "Enhancing Students’ Higher Order Thinking Skills through Computer-Based Scaffolding in Problem-Based Learning." (2017).
Loparev, Anna. The Impact of Collaborative Scaffolding in Educational Video Games on the Collaborative Support Skills of Middle School Students. University of Rochester, 2016.
Masek, Alias, and Sulaiman Yamin. "The Impact of Instructional Methods on Critical Thinking: A Comparison of Problem-Based Learning and Conventional Approach in Engineering Education." ISRN Education 2012 (2012).
McLoughlin, Catherine. "Learner Support in Distance and Networked Learning Environments: Ten Dimensions for Successful Design." Distance Education 23, no. 2 (2002): 149-62.
Mennin, S, P Gordan, G Majoor, and HA Osman. "Position Paper on Problem-Based Learning." Education for health (Abingdon, England) 16, no. 1 (2003): 98-113.
Newmann, Fred M. "Higher Order Thinking in the Teaching of Social Studies: Connections between Theory and Practice." Informal reasoning and education (1991): 381-400.
Ogden, Conswellor Denise. "Skype as a Scaffolding Tool for Underprepared Freshmen English Composition Students." (2015).
Quinn, Margaret F, Hope K Gerde, and Gary E Bingham. "Help Me Where I Am: Scaffolding Writing in Preschool Classrooms." The Reading Teacher 70, no. 3 (2016): 353-57.
Rosenshine, Barak, and Carla Meister. "The Use of Scaffolds for Teaching Higher-Level Cognitive Strategies." Educational leadership 49, no. 7 (1992): 26-33.
Savery, John R. "Overview of Problem-Based Learning: Definitions and Distinctions." Essential readings in problem-based learning: Exploring and extending the legacy of Howard S. Barrows 9 (2015): 5-15.
Saye, John W, and Thomas Brush. "Scaffolding Critical Reasoning About History and Social Issues in Multimedia-Supported Learning Environments." Educational Technology Research and Development 50, no. 3 (2002): 77-96.
Schellens, Tammy, Hilde Van Keer, Bram De Wever, and Martin Valcke. "Tagging Thinking Types in Asynchronous Discussion Groups: Effects on Critical Thinking." Interactive Learning Environments 17, no. 1 (2009): 77-94.
Schmidt, Henk G. "Foundations of Problem‐Based Learning: Some Explanatory Notes." Medical education 27, no. 5 (1993): 422-32.
Semerci, Nuriye. "The Effect of Problem-Based Learning on the Critical Thinking of Students in the Intellectual and Ethical Development Unit." Social Behavior and Personality: an international journal 34, no. 9 (2006): 1127-36.
Şendağ, Serkan, and H Ferhan Odabaşı. "Effects of an Online Problem Based Learning Course on Content Knowledge Acquisition and Critical Thinking Skills." Computers & Education 53, no. 1 (2009): 132-41.
Smith, Mike, and Kathryn Cook. "Attendance and Achievement in Problem-Based Learning: The Value of Scaffolding." Interdisciplinary Journal of Problem-based Learning 6, no. 1 (2012): 8.
Topping, Keith J. "Trends in Peer Learning." Educational psychology 25, no. 6 (2005): 631-45.
Van de Pol, Janneke, Monique Volman, and Jos Beishuizen. "Scaffolding in Teacher–Student Interaction: A Decade of Research." Educational psychology review 22, no. 3 (2010): 271-96.
Wass, Rob, Tony Harland, and Alison Mercer. "Scaffolding Critical Thinking in the Zone of Proximal Development." Higher Education Research & Development 30, no. 3 (2011): 317-28.
Wood, David, Jerome S Bruner, and Gail Ross. "The Role of Tutoring in Problem Solving." Journal of child psychology and psychiatry 17, no. 2 (1976): 89-100.
Yew, Elaine HJ, and Karen Goh. "Problem-Based Learning: An Overview of Its Process and Impact on Learning." Health Professions Education 2, no. 2 (2016): 75-79.
Ziadat, Ayed H, and Mohammad T Al Ziyadat. "The Effectiveness of Training Program Based on the Six Hats Model in Developing Creative Thinking Skills and Academic Achievements in the Arabic Language Course for Gifted and Talented Jordanian Students." International Education Studies 9, no. 6 (2016): 150.
Publication date:

 2019-06-04    

自适应英语写作系统社交模块的设计与实践.陈陟

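Link

Title:

 自适应英语写作系统社交模块的设计与实践

Name:

 陈陟

Student ID:

 1601210476

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Public release:

 Public

Degree level:

 Master's

Degree:

 Master of Engineering (professional degree)

Degree-granting institution:

 Peking University

School:

 School of Software and Microelectronics

Supervisor 1 name:

 姚亚芝

Supervisor 1 affiliation:

 Beijing Jiaotong University

Supervisor 2 name:

 高志军

Supervisor 2 affiliation:

 School of Software and Microelectronics

Thesis defense date:

 2019-05-27

English title:

 Adaptive English Writing System Social Module Design And Practice

Keywords (Chinese):

 大学英语写作 过程写作教学 协作学习 学习动机 结构化研讨

Keywords (English):

 English writing teaching Collaborative learning theory Motivation theory Structured research method

Abstract: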

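As globalization deepens, English writing is becoming ever more important, and learners increasingly want to improve their writing ability. Because traditional writing instruction values the written product far more than the writing process, and because writing ability cannot be improved overnight, writing has become a weak point in many students' English learning. In recent years, with the spread of online learning and of theories such as process writing and collaborative learning, more and more scholars have turned their attention to the writing process, proposing collaborative learning methods such as group discussion and peer review and applying them in online English writing courses and English writing learning systems. However, existing online English writing courses and learning systems do not start from the practice of learning to write in English: group discussions drift off topic and lack enthusiasm, while peer feedback suffers from poor review quality and low willingness to participate, so students' writing ability improves slowly.

Drawing on second language writing theory, collaborative learning theory, motivation theory, and the structured discussion method, this thesis analyzes the strengths and weaknesses of existing writing instruction models and competing products with respect to their social modules. In response to the problems found and to the writing needs of university students, and in light of evaluation criteria for social modules, it designs the social module of an adaptive English writing system. A structured discussion community raises the quality of discussion, trains students' thinking, and reduces the helplessness and anxiety they feel while writing. A structured peer review procedure helps students develop an approach to marking and obtain diverse, high-value feedback. Reward mechanisms increase students' sense of participation and satisfaction during learning and stimulate their willingness to take part.

Because the system involves many social mechanisms, the thesis does not describe each in detail. It selects two representative functions, the Creative Square (创意广场) and peer review, for study and discussion, and proposes different design options for each. It also briefly introduces the design of the other modules in the system.

Based on this design, a teaching experiment was conducted with 20 non-English-major students at a university in Nanjing. Through experimental observation, data analysis, in-depth interviews, and questionnaires, the study shows that the Creative Square and peer review designs help to improve the quality of group discussion, reduce students' helplessness and anxiety, train their thinking, and increase their enthusiasm for discussion, and that they also help to improve the quality of peer review and students' critical thinking, cognitive ability, and willingness to participate. The best design option for each of the two functions was also identified.

The social design of the adaptive English writing system in this study makes up for shortcomings in collaborative learning for online English writing. In the group discussion stage, the quality of discussion and students' thinking both improved, and students' anxiety and fear of difficulty when writing were eased. In the peer review stage, review quality improved, students' need for diverse, high-value feedback was met, their abilities grew, and their willingness to give feedback was stimulated. The design fills gaps in existing collaborative learning designs for English writing and offers a useful reference for mobile teaching of English writing and for collaborative learning.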

English abstract:

As globalization continues to develop, English writing becomes all the more important, and English learners are increasingly eager to improve their writing skills. Traditional writing instruction, however, emphasizes results over processes, and because writing skills cannot be cultivated overnight, writing remains a weak link for college students. In recent years, with online learning, the process theory of composition and collaborative learning theory gaining popularity, more scholars than before pay attention to the writing process. They put forward collaborative learning approaches such as group discussion and peer review, and apply these approaches to various online English writing systems. However, existing online English writing systems are not grounded in practice. Problems occur such as off-topic or poorly motivated discussions, low-quality peer evaluation and low willingness to participate. As a result, students' writing ability improves slowly.

Based on L2 writing theory, collaborative learning theory, motivation theory and the structured discussion method, this thesis analyzes common teaching methods for English writing and current teaching products. The author applies the standards and content of collaborative learning methods and designs systematic social modules that target the problems mentioned above and the writing needs of college students. The thesis aims to improve the quality of discussions and train students' critical thinking via structured discussion groups, so as to reduce their helplessness and anxiety in the writing process. Structured peer review and intelligent recommendation of reviewing peers can help students develop an approach to correction and get diversified and valuable feedback. In addition, with rewards and other mechanisms, students would be more satisfied and more motivated to participate in the discussion.

As many social mechanisms are involved in the system, this thesis does not cover all of them and instead selects two core functions for research, namely the Creative Square and peer review. The thesis presents different design proposals for them and briefly introduces the design ideas behind the other modules in the system.

In this thesis, 20 non-English majors from a university in Nanjing took part in the experiment. Through experimental observation, data analysis, in-depth interviews and questionnaires, the author shows that the Creative Square and peer review can improve the quality of group discussions and reduce students' helplessness and anxiety. They also help to cultivate students' critical thinking, motivate students to discuss, and improve the quality of peer review and students' willingness to participate. The best solution for each function is also presented.

In this study, the social design of the adaptive English writing system optimized the process of collaborative learning, improved the quality of discussion, developed the students' critical thinking and relieved anxiety in the writing process. In the post-writing stage, the quality of mutual evaluation was improved to meet students' need for diversified and valuable feedback. This can inspire students' willingness to provide feedback and shore up a weak link in online English writing instruction, which has some reference value for the mobile teaching of English writing and for cooperative learning.

 

Classification number:

 TP3

Total pages:

 52

Number of references:

 53

Reference list:
毕劲,秦晓晴等. 2014. 国外英语学术写作研究趋势及其启示[J].外语教学,35(2):45-48.
陈春梅. 2012. 限行六中英语写作教学理论比较分析[J].湖南科技学院学报,33(9):150-152.
蔡宁伟,于慧萍. 2015. 参与式观察与非参与式观察在案例研究中的应用[J].管理学刊,28(4):66-69.
邓鹂鸣,刘红,陈艳等.2004.过程写作法在大学英语写作教学实验中的运用[J].外语教学,25(6):69-72.
代碧薇. 2017. 基于wiki的小组协作式翻译教学研究[D]. 北京大学.
巩潇宁. 2016. ADCS动机模型的应用探究—以小学语文课堂为例[D]. 上海师范大学.
龚晓斌. 2007. 英语写作教学:优化的同伴反馈[J].国外外语教学,(3):47-51.
郭晓英,王宝峰. 2009. 基于网络博客的大学英语写作模式,11(3):1-5.
郭燕,樊葳葳. 2009. 大学英语分层次教学背景下的写作焦虑实证研究,(10):79-84.
郭有松,谭良. 2017. 移动协作学习的个性化分组策略研究[J].中国远程教育,(8):21-28.
贺学贵.2010.“过程写作法”在英语写作教学中的应用[J].黄冈师范学院学报,(2):89-91.
黄渐法.2017.基于小组合作的英语写作教学探索[J].西部素质教育,3(22):223-224.
侯彩静,苏鹏等. 2017. 同伴评价在大学英语过程写作教学中的应用[J]. 大学教育,(1):105-106.
侯彩静,苏鹏等. 2017. 同伴评价对英语写作能力培养的影响[J]. 山西大同大学学报,29(5): 98-99.
何芳,王伟等.2013基于网络的英语学习研究[M].知识产权出版社
蒋云华,罗乐.2014.基于大学英语写作教学中的批判性思维培养[J].校园英语(中旬),(9):4-5.
康霞.2017.英语写作教学理论与实践研究[M].北京邮电大学出版社.
连秀萍. 2012. 合作学习对大学生英语写作的影响[J]. 西南农业大学学报(社会科学版),10(8):198-202.
兰良平,韩刚.2014.英语写作教学:课堂互动性交流视角[M].外语教学与研究出版社.
刘黄玲子,黄荣怀.2002.协作学习评价方法[ J].现代教育技术.(1):24-29,76
刘凤娇. 2011. 激励性评价策略下“写长法”在高中英语写作教学中的应用研究[D]. 山东师范大学.
楼荷英,陈阳明等. 2008. 运用网上辅导和师生论坛的写作教学研究[J].Foreign Language World,(4):41-47.
刘志强. 2007. 英语写作教学中创造性思维的培养[J].黑龙江教育学院学报,26(7):153-154.
刘芳琼. 2007. 大学英语学习中的心理障碍分析与对策[J].南京师范高等专科学校学报,24(2):90-93.
梁茜.2003.切实有效的实施写作过程教学法—英语写作教学模式新探[J].成都师范学院学报,19(9):53-55
倪清泉. 2009. 网络环境下基于写作学习的大学英语写作教学研究[J].外语电化教学,(127):63-68.
曲巍巍. 2016. 基于自动评分系统的协作式大学英语写作教学实证研究 [J].亚太教育,(32):110-111.
涂志云.2011.激励策略在大学英语词汇教学中的应用[J].文教研究,(2)
吴育红,顾卫星. 2011.合作学习降低非英语专业大学生英语写作焦虑的实证研究[J]. 外语与外语教学,(6):51-55.
王晓芳. 2012. 基于微博的大学英语协作式写作教学研究[D]. 河南师范大学.
王永红. 2007. 同伴在线作文互评浅析[J]. 中国水运,5(12):252-253.
王世卿,韩春雷. 2015. 结构化研讨法在研究生教学中的应用研究—以“和谐警民关系”专题为例[J]. 中国人民公安大学学报(自然科学版),(4):89-92.
韦储学. 2008. 建构主义理论及其对大学英语写作教学的启示[J].高教论坛,(3):90-92.
伍新春,管琳. 2010 .合作学习与课堂教学[ M].人民教育出版社.
于夕真. 2007. 写作认知心理过程的研究与博客大学英语写作教学[J].外语电化教育,(116):74-78.
赵建华,李克东.2010.协作学习及其协作学习模式[J].中国电化教育,(10):5-6.
张会萍. 2017. 网上同伴互评在英语写作能力发展中的积极作用[J]. English Teachers,17(22):27-30.
周昕. 2013. 开展结构化讨论,探索党校教学改革新路径[J]. 长江论坛,(6):92-95.
赵翊君,杨跃. 2013. 博客辅助英语专业过程写作教学模式的实证研究,(153):46-51.
周红. 2007. 第二语言写作教学理论研究动态[J].云南师范大学学报,5(6):27-32
Bush J, Zuidema L. 2013. Professional Writing in the English Classroom: Professional Collaborative Writing--Teaching, Writing, and Learning--Together.[J]. English Journal.(102)
Cheng, Y.2004. A Measure of Second Language Writing Anxiety: Scale Development and Preliminary Validation [J]. Journal of Second Language Writing.13(4):313-335.
Castro A A. 1999. Structuring the discussion of scientific papers. Randomised controlled trial of structured discussions is needed[J]. Bmj, 319(7209):581.
Dillenbourg.2000.Collaborative learning: cognitive and computational approaches[J]. Computers & Education,35(1):83-86.
Grabe, W., & Kaplan, R. B.1996. Theory and Practice of Writing: An Applied Linguistic Perspective[J]. Applied Linguistics and Language Study London: Longman
Harrer A. 2004. Analysis and Intelligent Support of Learning Communities in Semi-Structured Discussion Environments[M].Artificial Intelligence Applications and Innovations
Kepner,C.G.1991.An experiment in the relationship of types of written feedback to the development of second language writing skills[J]. The Modern Language Journal,(75):305-313
Schaffert,S etc.2006 Learning with Semantic Wikis[C].Proceedings of the First Workshop on Semantic Wikis-From Wiki To Semantics:109-123
Swain,M.1995. Three Functions of Output in Second Language Learning [A].// G.Cook & B. Seidlhofer (Eds.) Principle and Practice in Applied Linguistics [C].Oxford : Oxford University Press
Suwantarathip O , Wichadee S. 2014. The Effects of Collaborative Writing Activity Using Google Docs on Students' Writing Abilities[J]. Turkish Online Journal of Educational.13(2):148-156.
Seyyedrezaie Z S , Ghonsooly B , Shahriari H , et al. 2016. A MIXED METHODS ANALYSIS OF THE EFFECT OF GOOGLE DOCS ENVIRONMENT ON EFL LEARNERS’ WRITING PERFORMANCE AND CAUSAL ATTRIBUTIONS FOR SUCCESS AND FAILURE[J]. Turkish Online Journal of Distance Education, 17(3).
Zhou W , Simpson E , Domizi D P. 2012. Google Docs in an Out-of-Class Collaborative Writing Activity[J]. International Journal of Teaching & Learning in Higher Education, 24(3):359-375.
Publication date:

 2019-06-05    

面向考试应用的托福积极词汇学习微信小程序的设计.黄郭钰慧

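Link

Title:

 面向考试应用的托福积极词汇学习微信小程序的设计

Author:

 黄郭钰慧

Student ID:

 1601210558

Language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Public release:

 After 3 years

Degree level:

 Master's

Degree:

 Master of Engineering (professional degree)

Degree-granting institution:

 Peking University

School:

 School of Software and Microelectronics

Supervisor name:

 朱源

Supervisor affiliation:

 School of Software and Microelectronics

Second supervisor name:

 张宏岩

Second supervisor affiliation:

 School of Software and Microelectronics

Defense date:

 2019-05-27

Title (English):

 The Design of a Wechat Applet on Active Vocabulary for the TOEFL Examination

Keywords (Chinese):

 托福考试 积极词汇 微信小程序 二语词汇习得

Keywords (English):

 TOEFL Active vocabulary Wechat Applet Second language vocabulary acquisition

Abstract: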

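    Compared with general English vocabulary learning, TOEFL vocabulary learning is characterized by strong subject specificity, a large volume of material, and a short preparation period. Some learners misunderstand TOEFL vocabulary learning, believing it is enough to know what a word means; under the requirements of the TOEFL test, however, a portion of the core high-frequency words must be turned into active vocabulary, that is, words the learner can use fluently in listening, writing, and speaking. A survey shows that existing TOEFL vocabulary apps and WeChat applets pay little attention to training active vocabulary, so learners' grasp of the words stays at the level of reading.
    Starting from a needs survey and grounded in TOEFL vocabulary teaching theory, active vocabulary learning theory, and second language vocabulary acquisition theory, this study designs a WeChat applet focused on TOEFL active vocabulary, with the aim of improving learners' efficiency and initiative. Compared with conventional vocabulary apps, it attempts the following improvements. First, to improve learning efficiency: it sorts out the disciplines and sense groups that the TOEFL commonly tests, so that learners memorize words by category; through word-frequency analysis and summarization it compiles the core high-frequency TOEFL words as active vocabulary, comprising high-frequency words for listening, writing, and speaking; it promotes active vocabulary input through lexical chunk input, context input, association input, and multisensory input; and it tailors exercise content to the specific ways the TOEFL assesses candidates, promoting active vocabulary output. Second, to improve learning initiative: drawing on the ARCS model and cognitive load theory, the interface design raises learners' interest in terms of attention, relevance, and confidence; and, exploiting the applet's close integration with WeChat groups, a points-based incentive scheme and a companion-learning scheme strengthen learner retention and activity.
    To validate the design, 55 learners at a university in Wuhan took part in a three-week teaching experiment. Learning efficiency was measured with a pretest (before using the applet), a posttest (after using the applet), and a delayed posttest (one week after the posttest), and learning initiative (attention, relevance, confidence, satisfaction) was examined with a satisfaction questionnaire. The results show that the TOEFL active vocabulary WeChat applet designed in this study improves learning efficiency significantly more than existing TOEFL vocabulary WeChat applets, but that improving learning initiative needs further exploration.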

Abstract (English):

    Compared with general English vocabulary building, TOEFL vocabulary building is characterized by a wide range of disciplines, a heavy workload, and comparatively little preparation time. Some students misunderstand TOEFL vocabulary building, falsely believing that knowing the Chinese meaning of a word is sufficient. Nevertheless, according to the TOEFL test requirements, students must be able to use some TOEFL words in the listening, writing and speaking tests. A survey shows that most apps and WeChat Applets do not help students build their active vocabulary effectively, so that vocabulary building serves reading purposes only.
    Starting from a requirement survey, this study applies theories of TOEFL vocabulary coaching, active vocabulary building and second language acquisition to the design of a TOEFL active vocabulary WeChat Applet that aims to improve learners' learning efficiency and learning attitude. In comparison with existing alternatives, the WeChat Applet first attempts to improve learning efficiency in the following ways: 1. Classify TOEFL words by discipline and meaning group to enable memorization by category. 2. Generate a core TOEFL active vocabulary word list using frequency analysis, including the most frequently used words for listening, speaking and writing. 3. Combine lexical chunk input, context input, association input and multi-sensory input to improve the efficiency of students' TOEFL active vocabulary input. 4. Tailor the exercises to the specific requirements of the TOEFL examination to enhance learners' TOEFL active vocabulary output. Second, the WeChat Applet uses the ARCS model and cognitive load theory to improve students' learning attitude in three areas: attention, relevance and confidence. It also engages students through a points-accumulation incentive system and the use of WeChat groups.
    An empirical experiment lasting three weeks was conducted at a university in Wuhan, in which 55 first-year students tested the validity of the WeChat Applet. The students' learning efficiency was examined using three tests, namely a pre-test (before using the WeChat Applet), a post-test (after using the WeChat Applet) and a delayed test (one week after the post-test), whilst their learning attitudes were investigated through a satisfaction questionnaire. In conclusion, the WeChat Applet studied here improves students' learning efficiency significantly more than existing apps on the market; nonetheless, further exploration is needed into improving students' learning attitude.

Classification number:

 G43

Total pages:

 68

Number of references:

 78

References:
白新国,刘清堂,徐宁.教育游戏中激励机制的分析与设计[C].教育技术国际论坛,教育技术的创新、发展与服务(下册),武汉,2006:244-248.
陈巧芬.认知负荷理论记忆发展[J].现代教育技术,2007(9):16-19.
池昌海.陈望道全集[M].浙江:浙江大学出版社,2011.
崔艳嫣.接受性词汇量、产出性词汇量和词汇研究深度知识的发展路径及其相关性研究[J].现代外语(季刊),2006,29(4):392-400.
Diller,K.,顾诚.学习外语有最佳年龄吗?[J].国外外语教学,1982(1):1-4.
段士平.国内二语语块教学研究述评[J].中国外语,2008(4):63-39.
冯巨澜.提高大学习者英语学习中的积极词汇——关于积极词汇的实证研究[D].重庆:重庆大学,2005.
何华清.高中生英语高频词汇水平实证研究[J].中国外语教育,2016,9(4):53-59.
花蓉.外语学习中的词汇记忆问题及对策[J].科教导刊,2015(8):148-150.
类兴艳.书面输出任务对非英语专业学习者产出性词汇能力的影响研究[D].南昌:江西师范大学,2016.
李健民.英语词汇的多维研究[M].北京:光明日报出版社,2012.
李青,李莹.移动学习应用中积分激励机制设计[J].北京邮电大学学报(社会科学版),2018,20(3):104-113.
刘锋.英语词汇移动学习记忆管理软件的设计与实现[D].天津:天津师范大学,2014.
刘淑君,杨仲敏,李长均.词汇记忆类软件对宁夏大学英语专业学习者词汇记忆的影响[J].时代教育,2017(9):208.
卢术娟.词汇形式和意义的强化输入顺序对大学习者英语产出性词汇知识习得的影响[D].成都:四川外国语大学,2017.
陆宇佳.托福阅读理解能力的语言因素影响研究[D].成都:西南交通大学,2018.
卢敏.产出性词汇知识广度的发展特征——基于英语专业学习者书面语的研究[J].外语教学理论与实践,2008(2):10-15.
马晓楠.思维导图在高中英语词汇教学中的应用实证研究[D].长沙:湖南师范大学,2017.
蒙台梭利著,胡纯玉译.发现孩子[M].北京:中国发展出版社,2006.
牛瑞英.合作输出中的任务角色对二语词汇习得作用的一项实验研究[J].山东外语教学,2010(4):3-9.
Olivier & Bowler.丁凡译.多感官学习[M].台北:源流出版社,2000.
潘翠翠,何颖,丁珊珊,王艺,李坤.基于APP的英语词汇记忆实证研究[J].海外英语.2017(10):73-79.
荣岩.医学英语词汇学习系统研究与设计[D].北京:北京大学,2019.
田蕾.新托福与高考英语测试全国卷真实性对比研究[J].外语教育教学,2015(12):157-160.
童淑华.第二语言产出性词汇习得研究[M].吉林大学出版社,2010:21-22.
王海棠.高中英语词汇教学策略探究[J].英语教师,2017(19):66-69.
王清,张必兰.基于增强现实的安卓英语词汇识记软件的设计与实现[J].电脑知识与技术,2014,10(27):31-35.
徐春,章晓辉.学习和记忆的突触模型:长时程突触可塑性[J].自然杂志,2009,31(3):136-141.
徐亮.基于自适应学习模式的大学英语产出性词汇教学研究[D].北京:北京大学,2015.
杨进中.认知负荷理论视角的移动课程教学设计[J].现代远程教育研究,2012(3):86-90.
岳颖莱.究竟是“附带习得”还是“附带学得”[J].新课程学习,2010(4):20-21.
张萍.二语词汇习得研究:十年回溯与展望[J].外语与外语教学,2006(6):21-26.
张若男.词汇记忆APP对于初中英语学习效果提升的探索研究[D].上海:上海师范大学,2018.
赵瑞芬.多感官学习的研究现状与展望[J].生物技术世界,2016(4):286-287.
郑瑞珺.怎样以图式理论指导改善大学习者托福IBT听力教学[J].文教资料,2018(3):225-226.
周小华.基于核心素养下语境教学法在英语词汇教学中的运用[C].中国会议,教育理论研究阅读教学,2019.
周远清.深化教学改革,提高教学质量[N].中国教育报,2006(12).
Bandura, A. On the functional properties of perceived self-efficacy revisited[J]. Journal of Management, 2012a, 38(1): 9-44.
Bandura, A. Self-efficacy: The exercise of control[M]. New York: Freeman, 1997.
Bandura, A. Social foundations of thought and action: A social cognitive theory[M]. Englewood Cliffs, NJ: Prentice-Hall, 1986.
Becker, J. The Phrasal Lexicon[M]. Cambridge, Mass.: Bolt Beranek and Newman, 1975.
Brown C., Payne M. E. Five essential steps of processes in vocabulary learning [C]. Paper presented at the TESOL convention, Baltimore. 1994.
Brown, J. I. Reading improvement through vocabulary development: The CPD Formula[C]. In New Frontiers in College-Adult Reading: Fifteenth Yearbook of the National Reading Conference: 197-202.
Choi, S., & Clark, R. E. Cognitive and affective benefits of an animated pedagogical agent for learning English as a second language[J]. Journal of Educational Computing Research, 2006, 34(4): 441-466.
Colakoglu, O., Akdemir, O. Motivational measure of the instruction compared instruction based on the ARCS motivation theory vs traditional instruction in blended courses[J]. Turkish Online Journal of Distance Education, 2010, 11(1): 73-89.
Ebbinghaus, H. Memory: A Contribution to Experimental Psychology[M]. New York: Columbia University, 1913: 30-89.
Garris, R., Ahlers, R. & Driskell, J. Games, motivation, and learning: a research and practice model[J]. Simulation and Gaming, 2002, 33(4): 441-467.
Ghaffari, M., & Mohamadi, R. The effect of context (humorous vs. Non-humorous) on vocabulary acquisition and retention of Iranian EFL learners[J]. International Journal of Applied Linguistics and English Literature, 2012, 1(6), 222-231.
Halliday, M. A. K. & Hasan, R. Language, Context and Text: Aspects of language in a social-semiotic perspective[M]. Oxford: Oxford University Press, 1985.
Herodotou C., Winters N., Kambouri M. An iterative, multidisciplinary approach to studying digital play motivation: the model of game motivation[J]. Games and Culture: a journal of Interactive Media, 2015, 10(3): 1-20.
Keller. J. M. First principles of motivation to learn and e3-learning[J]. Distance Education, 2008, 29(2), 175-185.
Krashen S. D. The input hypothesis: issues and implications[M]. Addison-Wesley Longman Ltd, 1985.
Krashen S. D. & T. D. Terrell. The natural approach: language acquisition in the classroom[M]. Oxford: Pergamon, 1983.
Kwan, K. N., Ching, H. L., Wai, M. L. The impact of social mobile application on students’ learning interest and academic performance in Hong Kong’s sub-degree education[C]. 2016 International Symposium on Educational Technology (ISET), 2016, 18-22.
Laufer, B. The development of passive and active vocabulary in second language: Same or different?[J]. Applied Linguistics, 1998, 19: 255-271.
Laufer, B. ‘Sequence’ and ‘Order’ in the development of L2 lexis: Some evidence from lexical confusions[J]. Applied Linguistics, 1990, 11(3): 281-296.
Laufer, B. & Hulstijn, J. Incidental vocabulary acquisition in a second language: The construct of task-induced involvement[J]. Applied Linguistics, 2001, 22(1): 1-26.
Laufer, B. & Nation, P. A vocabulary-size test of controlled productive ability[J]. Language testing, 1999. 16(1): 33-51.
Lawson, M. J., & Hogben, D. The vocabulary- learning strategies of foreign- language students[J]. Language learning, 1996, 46(1): 101-135.
Lewis, M. The lexical approach[M]. Hove, England: Language Teaching Production, 1993.
Liu, Z. W. A study on the application of Wechat in training[J]. Theory and Practice in Language Studies, 2014, 4(12): 2549-2554.
Loorbach, N., Karreman, J. & Steehouder, M. Adding motivational elements to an instruction manuals: effects on usability and motivation[J]. Applied Research, 2007, 54(3): 343-358.
McCarthy M. Discourse Analysis for Language Teachers[M]. Cambridge University Press, 1991.
McCarthy, M., & Wigglesworth, G. Vocabulary teaching and learning special issue[J]. Prospect Journal, 2001. 16(3).
Meara, P. Vocabulary acquisition: A neglected aspect of language learning[J]. Language Teaching and Linguistics: Abstracts, 1980, 13(4): 221-246.
Nagy, W. E., & Herman, P. A. Incidental vs. instructional approaches to increasing reading vocabulary[J]. Educational perspectives, 2003, 23(1): 16-21.
Paas, F., Renkl, A. & Sweller, J. Cognitive load theory: Instructional implications of the interaction between information structures and cognitive architecture[J]. Instructional Science, 2004, 32: 1-8.
Parry, K. Second language vocabulary acquisition: A rationale for pedagogy[J]. Vocabulary and comprehension, 1997: 55.
Schwabe G., Goth C. Mobile learning with a mobile game: design and motivational effects[J]. Journal of Computer-Assisted Learning. 2005, 21(3): 204-216.
Shi, Z. J., Luo, G. F. Application of Wechat teaching platform in interactive translation teaching[C]. International Journal of Emerging Technologies in Learning, 2014, 9(9): 71-75.
Sokmen A. J. Word association results: a window to the lexicon of ESL students[J]. JALT Journal, 1993, 15(2): 135-150.
Sperber, D. & Wilson, D. Relevance: Communication and cognition[M]. Oxford: Blackwell Publishers Ltd, 1989.
Stockwell, G. Vocabulary on the move: Investigating an intelligent mobile phone-based vocabulary tutor[J]. Computer Assisted Language Learning, 2007, 20(4): 3.
Swain, M. & Lapkin, S. Problems in output and the cognitive processes they generate: a step towards second language learning[J]. Applied Linguistics, 1995, 16(3): 371-393.
Van der Meij, H., van der Meij, J., & Harmsen, R. Animated pedagogical agents effects on enhancing student motivation and learning in a science inquiry learning environment[J]. Educational Technology Research and Development, 2015, 63(3): 381-403.
Van Merrienboer, J. J. G. & Kirschner, P. A. Ten Steps to Complex Learning: A Systematic Approach to Four Component Instructional Design[M]. Mahwah, NJ: Lawrence Erlbaum Associates, 2007.
Wesche, Marjorie, and T. Sima Paribakht. Assessing second language vocabulary knowledge: depth versus breadth[J]. Canadian Modern Language Review, 1996, 53(1): 13-40.
Wigfield, A., & Eccles, J. S. Expectancy-value theory of achievement motivation[J]. Contemporary Educational Psychology, 2000. 25(1), 68-81.

Release date:

 2022-06-03    

出版审校流程中专业审校与目标读者审校的对比研究——以《培养小极客》为例.张心彧

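Link

Title:

 出版审校流程中专业审校与目标读者审校的对比研究——以《培养小极客》为例

Name:

 张心彧

Student ID:

 1601210866

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Release status:

 Public

Level of study:

 Master's

Degree:

 Professional Master of Engineering

Degree-granting institution:

 Peking University

School:

 School of Software and Microelectronics

Advisor 1 name:

 李博婷

Advisor 1 affiliation:

 School of Software and Microelectronics

Defense date:

 2019-05-27

Foreign-language title:

 Comparative Study of Editor's Review and Target Reader's Review During Publishing Process—A Case Study of Bringing Up Geeks

Keywords:

 出版审校流程 专业审校 目标读者审校

Foreign-language keywords:

 Review and Publishing Process Editor’s Review Target Reader’s Review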

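Abstract:

At present, translation studies in China concentrate mostly on translators, translated works, and translation strategies. Research on review is comparatively scarce, and research on the publishing review of Chinese editions of translated books is rarer still. While translating Bringing Up Geeks (《培养小极客》), the author gained an in-depth understanding of the publisher's review process and found that some publishers, when publishing Chinese editions, now add a target-reader review stage on top of the existing in-house professional review. This stage compensates for the shortcomings of professional review, improves the editing and proofreading quality of Chinese-edition books, and enhances readers' reading experience.

Drawing on concrete cases from the translation and review of Bringing Up Geeks, this thesis therefore takes target-reader review as its object of study and aims to clarify its value. First, combining theory with practice, it argues for the possibility of diversifying reviewers and of bringing reader feedback forward, and analyzes from a theoretical perspective the role target readers may play in review, thereby establishing the feasibility of involving target readers in the review of Chinese-edition books. Second, it compares professional editors and proofreaders with target readers along four dimensions: professional review competence, review purpose, review standards or norms, and review methods. Through case analysis of Bringing Up Geeks, it examines in detail the role target-reader review plays in four types of problems: unclear sentence meaning, unclear word meaning, cultural differences, and optimization of expression. Finally, together with an analysis of the limitations of professional review, it identifies the functions of target-reader review: (1) helping resolve ambiguities of meaning overlooked by professional review, making the translation more comprehensible; (2) polishing the wording, making the translation more readable; and (3) helping resolve cultural-difference problems overlooked by professional review and proposing a variety of corresponding solutions, making the translation richer.

This thesis demonstrates that, in publishing Chinese-edition books, target-reader review is an effective supplement to professional review. Incorporating it into publishers' review processes is of value in compensating for the limitations and blind spots of professional review, improving editing and proofreading quality, and making books more appealing and persuasive to readers.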

Foreign-language abstract:

Currently, translation studies in China mostly focus on translators, translated works, and translation strategies, while research on the review and publishing process is relatively rare. After translating Bringing Up Geeks by the American writer Marybeth Hicks, the author of this paper looked more closely at the review and publishing process and found that the publisher’s review comprises not only an internal editor’s review but also an external target reader’s review. The latter makes up for the limitations and sheds light on the blind spots of the editor’s review, improves the quality of the translation, and enhances readers’ reading experience.

Combining qualitative and quantitative analysis, this paper analyzes the differences between the editor’s review and the target reader’s review and systematically clarifies the functions of the target reader’s review, based on specific problems that occurred during the review and publishing process of Bringing Up Geeks. The analysis shows that the editor’s review and the target reader’s review differ in skills, purposes, standards, and methods. The research also finds that the target reader’s review can complement the editor’s review on four types of problems in the translation: ambiguity of sentences, ambiguity of words, cultural differences, and optimization of expression, even though the target reader may make wrong or unnecessary suggestions due to subjective factors.

In conclusion, this paper argues that translation revision is a critical step in the publication of translated books. Target reader’s review, as a relatively new mode of review among domestic publishers, presents interesting interactions between the translator, editor, and reader. It is an effective supplement to the editor’s review and therefore deserves more attention from both publishers and researchers.

Classification number:

 G23

Total pages:

 198

Total number of references:

 57

Reference list:
[1] 陈玉姣. 引进版图书的翻译现状和对策[J]. 商情, 2017(30):252.
[2] 2017年引进版权汇总表. 中华人民共和国国家版权局[EB/OL]. (2018-10-10)[2019-03-21]. http://www.ncac.gov.cn/chinacopyright/contents/11228/386858.html.
[3] 苏秋丽. 开卷数据|近年国内引进版图书市场分析[EB/OL]. (2016-08-15)[2019-03-21]. https://mp.weixin.qq.com/s/svqdsfzxpdm_bqjka1vqpq.
[4] 闫明. 引进版图书的翻译现状与对策探讨[J]. 学园, 2014(19):12-13.
[5] 李新妞. 如何提升引进版图书的编校质量——以经管类书稿为例[J]. 传播力研究, 2018, 2(28):160-161.
[6] 刘苏华. 出版社编校质量控制模式构建[J]. 现代出版, 2013(2):57-60.
[7] 崔庆喜. 关键在于把“三审三校制”落到实处[J]. 中国出版, 1995(11):17-17.
[8] 李满意. 浅析图书编校质量问题成因和预警管理[J]. 中国编辑, 2017(05):47-52.
[9] 黄健, 王丹. 封面差错面面观:以医学及相关图书为例[J]. 出版广角, 2016(13):50-52.
[10] 尹玉吉. 中西方学术期刊审稿制度比较研究[J]. 浙江大学学报(人文社会科学版), 2012, 42(4):201-216.
[11] 郭力伟. 如何提高引进版图书的编校质量[J]. 新媒体研究, 2017, 3(1):92-93.
[12] 赵玉山, 程晶晶. 出版人职业生存现状调查样本报告(2017—2018年度)[J]. 科技与出版, 2018(10).
[13] 刘澍. 警惕编辑工作中的心理干扰[J]. 编辑学刊, 2006(6):29-31.
[14] 陶范. 析编辑的偏见[J]. 出版发行研究, 2005(12):32-35.
[15] 程静华, 苏克玉, 宁学才,等. 校对规律的研究现状及思考[J]. 中国科技期刊研究, 2003, 14(3):245-247.
[16] 杨娟林. 论心理因素对校对工作的影响[J]. 科学之友:上, 2007(2b):91-92.
[17] 张锋. 出版校对心理学研究[J]. 编辑学刊, 1997(6):30-36.
[18] 赵桂树. 校对工作的心理干扰及其排除[J]. 出版发行研究, 1999(3):23-25.
[19] Moustafa K. Is there bias in editorial choice? Yes[J]. Scientometrics, 2015, 105(3):2249-2251.
[20] Adin R. Dealing with editor’s bias[EB/OL]. (2015-01-14)[2019-03-25]. https://americaneditor.wordpress.com/2015/01/14/dealing-with-editors-bias/.
[21] Weller A. Potential bias in editorial peer review[J]. The Serials Librarian, 1991, 19(3-4):95-103.
[22] Gilliland S, Cortina J. Reviewer and editor decision making in the journal review process[J]. Personnel Psychology, 1997, 50(2):26.
[23] 孙会香. 如何提高图书编校质量[J]. 出版参考, 2011(21):26-26.
[24] 林瑞耕. 科技图书编辑手册[M]. 北京:中国铁道出版社, 2004.
[25] 严安. 读者是编辑工作的核心——浅谈编辑的起源及如何做好新时期编辑工作[J]. 学术论坛, 2010, 33(11):172-174.
[26] 李曙豪. 论编辑活动中“隐含的读者”[J]. 编辑之友, 2002(6).
[27] 钟天明. 编辑读者关系之我见[J]. 出版发行研究, 1991(2):36-38.
[28] 阙道隆. 书籍编辑学概论[M]. 沈阳:辽宁教育出版社, 1995.
[29] 牛正攀. 图书读者反馈机制构建研究[D]. 开封:河南大学, 2010.
[30] Müller J, Klemens H. Movies of the 80s[M]. Koln: Taschen, 2003.
[31] 王永宁. 校对人员要当好“第一读者”[J]. 传媒观察, 2008(4):61-61.
[32] Carney K M. The publisher's reader as feminist: the career of Geraldine Endsor Jewsbury[J]. Victorian Periodicals Review, 1996, 29(2):146-158.
[33] What are beta readers and sensitivity readers?[EB/OL]. (2019-01-18)[2019-03-27]. https://blog.reedsy.com/beta-readers-sensitivity-readers/.
[34] McMahon M. What is a beta reader?[EB/OL]. (2019-03-29)[2019-04-15]. https://www.wisegeek.com/what-is-a-beta-reader.htm.
[35] Mason E. This book is racist damaging rewritten[EB/OL]. (2018-03-19)[2019-04-01]. https://www.washingtonpost.com/graphics/2018/entertainment/books/keira-drake-the-continent-book-comparisons/?noredirect=on&utm_term=.0c0008cc86ba.
[36] Mason E. Publishers are hiring “sensitivity readers” to flag potentially offensive content[EB/OL]. (2017-02-15)[2019-03-20]. https://www.chicagotribune.com/lifestyles/books/ct-publishers-hiring-book-readers-to-flag-sensitivity-20170215-story.html.
[37] 邱晓伦. 浅谈翻译实践中审校译文的具体原则[J]. 语言与翻译, 2000(1):41-43.
[38] 郑四方, 李征娅. 关联视角下的读者观照与翻译研究[J]. 学术探索, 2012(7):149-151.
[39] 于利伟. 基于目标读者论的《傲慢与偏见》译本研究[J]. 语文建设, 2016(33):67-68.
[40] 张美芳, 陈曦. 巧传信息 适应读者——以故宫博物院网站材料翻译为例[J]. 中国翻译, 2013(4):99-103.
[41] 李小川. 情态意义翻译的读者接受原则研究[J]. 外语教学, 2012(6):101-104.
[42] Mohammed F, Mohammed A. Reader responses in Quranic translation[J]. Perspectives, 2000, 8(1):27-46.
[43] Adomat D. Handbook of research on children's and young adult literature (review)[J]. Bookbird: A Journal of International Children’s Literature, 2011, 49(4):84-84.
[44] 汉斯-赫尔穆特·勒林, 邓西录. 编辑的任务[J]. 中国编辑, 2003(1):83-84.
[45] 图书编校规范简明手册[M]. 西安:西北大学出版社, 2013.
[46] 赵崇岩. 读者阅读心理的研究[J]. 图书馆建设, 2000(4):92-94.
[47] 石宗源. 图书质量管理规定[J]. 印刷质量与标准化, 2005(3):61-64.
[48] 那欣. 技术进步语境下的黑马校对系统运用及其局限性[J]. 新闻传播, 2013(3):185-186.
[49] 杨宇良. 极客是谁?[J]. 软件工程, 2008(12):51-52.
[50] Causer C. The geeks have inherited the earth[J]. IEEE Potentials, 2017, 36(4):8-10.
[51] Hicks M. Bringing Up Geeks: How to Protect Your Kid’s Childhood in a Grow-Up-Too-Fast World[M]. Berkley: Berkley Trade, 2008.
[52] Fields A M. The Oxford English Dictionary[M]. Oxford: Clarendon Press, 1989.
[53] Forward S. Toxic Parents: Overcoming Their Hurtful Legacy and Reclaiming Your Life[M]. New York: Bantam Books, 2002.
[54] Kulkarni G, et al. BabyTalk: understanding and generating simple image descriptions[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2013, 35(12):2891-2903.
[55] Wallis C. The multitasking generation[EB/OL]. (2006-03-27)[2019-03-01]. http://content.time.com/time/classroom/glenfall2006/pdfs/the_multitasking_generation.pdf.
[56] 李国炎. 当代汉语词典[M]. 上海:上海辞书出版社, 2001.
[57] 许宝华, 宫田一郎. 汉语方言大词典[M]. 北京:中华书局, 1999.
Release date:

 2019-06-04    

京剧回译中的文化还原策略——以《伶界大王:1870-1937年京剧再造时期的演员与公众》为例.汪楚楠

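Link

Title:

 京剧回译中的文化还原策略——以《伶界大王:1870-1937年京剧再造时期的演员与公众》为例

Name:

 汪楚楠

Student ID:

 1501210695

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Release status:

 Public

Level of study:

 Master's

Degree:

 Professional Master of Engineering

Degree-granting institution:

 Peking University

School:

 School of Software and Microelectronics

Advisor 1 name:

 李博婷

Advisor 1 affiliation:

 School of Software and Microelectronics

Defense date:

 2019-05-27

Keywords:

 京剧 回译 文化还原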

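Abstract:

Translation is not only a process of decoding and encoding language but also a process of cross-cultural communication. The restoration of and return to the original text in back-translation is therefore a restoration not only of language but also of culture. Based on the practice of translating Drama Kings: Players and Publics in the Re-creation of Peking Opera, 1870-1937 (《伶界大王:1870-1937年京剧再造时期的演员与公众》) by the American author Joshua Goldstein (约书亚•葛以嘉), this thesis discusses the problems in the source text's English renderings of cultural content and the restoration strategies adopted for different kinds of content in back-translation.

After a brief review of research on back-translation, cultural restoration, and the translation of Peking opera, the author first analyzes the characteristics of Goldstein's English renderings of Peking opera culture and the translation difficulties they create. Goldstein makes heavy use of transliteration and pinyin annotation when rendering Peking opera terms and proper nouns. In the author's view, this approach preserves the foreign flavor of Chinese culture well, and the accompanying explanations help readers understand the cultural concepts; at the same time, some of the English renderings are inaccurate or inappropriate. The author then points out the difficulties that vague renderings of terms, polysemy, errors in transliterated personal names, and the diversity of citation sources in the source text pose for back-translation.

For the existing Chinese translation of the book's second chapter, the author analyzes its problems, chiefly inaccurate restoration of vocabulary, erroneous restoration, and imprecise back-translation of citations. For each problem the author offers reflections and what the author considers a more appropriate rendering. The author then summarizes the restoration strategies adopted for different situations in the back-translation practice, from two angles: vocabulary and citations. For vocabulary, the methods are to give a rendering after careful and thorough verification, to omit explanations in the source text that are redundant for Chinese readers, and to add necessary explanations or background knowledge through in-text notes or footnotes to make the translation more readable. For citations, the treatment depends mainly on whether the original quotation can be found and how closely the citation matches it; the methods are to restore the citation from its original text, to add a note explaining the source author's mistranslation, and to translate it oneself.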

Foreign-language abstract:

Translation is not only a process of language decoding and encoding, but also a process of cross-cultural communication. In other words, translation is not only about bilingual transformation, but also about cultural exchange. Therefore, the restoration to the original text in the process of back-translation is not only of language but also of culture. Based on the translation project of Drama Kings: Players and Publics in the Re-creation of Peking Opera, 1870-1937 written by Joshua Goldstein, this paper probes into the translation problems of Peking Opera and the strategies adopted for different cultural contents in the process of restoration.

This paper first summarizes the cultural content involved in the source text, which can be divided into two categories: vocabulary and citation. Then it analyzes the English translation of cultural content in the source text and points out the difficulties it poses to the back-translation, as they require the translator to handle carefully according to different circumstances. The paper then analyzes the problems in the Chinese translation of the second chapter of the source text, including inaccurate lexical restoration, incorrect restoration and imprecise back-translation of citations. In view of these problems, this paper puts forward appropriate translations.

For the two categories of cultural content in the source text, the present paper proposes corresponding restoration strategies. For vocabulary, there are three methods: doing detailed research, omitting redundant information, and annotating uncommon cultural concepts. For citations, the original wording is used where the source can be found; otherwise the translator renders them independently. This may involve translating into classical Chinese. In that case, Baidu’s classical-Chinese machine translation, still immature but the only one of its kind, may be applied, with the output revised by the translator.

Classification number:

 H08

Total pages:

 35

Total number of references:

 43

Reference list:
[1] Colin Mackerras. Review[J]. China Review International, FALL 2007,14(2): 443-446.
[2] Andrea Goldman. Reviews: Scholarship[J]. The Opera Quarterly, 2010,26(2-3):460-470.
[3] 冯庆华,李美. 文体翻译论[M]. 上海:上海外语教育出版社, 2001.
[4] 贺显斌.《回译的类型、特点与运用方法》[J]. 中国科技翻译, 2002(4):45-47.
[5] Mark Shuttleworth, Moira Cowie. Dictionary of Translation Studies[M]. New York:Routledge, 2014:14.
[6] 林煌天.《中国翻译词典》[M]. 武汉:湖北教育出版社, 2005:303.
[7] 陈志杰,潘华凌. 回译——文化全球化与本土化的交汇处[J]. 上海翻译, 2008(3):55-59.
[8] Gideon Toury. In Search of A Theory of Translation[M]. Tel Aviv: Porter Institute of Poetics and Semiotics, 1980:23-24.
[9] Richard Brislin. Back-Translation for Cross-Cultural Research[J]. Journal of Cross-Cultural Psychology, 1970,1(3): 185-216.
[10] 王正良. 回译研究[M]. 大连:大连海事大学出版社, 2007:168-215.
[11] 万雪梅. 试论汉学翻译[J]. 南京师范大学文学院学报, 2012(1):84-88.
[12] Edward Tylor. Primitive Culture: Research into the Development of Mythology, Philosophy, Religion, Art, and Custom[M]. London: John Murray. 1871: 1.
[13] 中国社会科学院语言研究所. 新华字典[M]. 北京:商务印书馆, 2004:504.
[14] Nida Eugene. Language and Culture: Context in Translating[M]. Shanghai: Shanghai Foreign Language Education Press, 2007:78.
[15] Peter Newmark. A Textbook of Translation[M]. Shanghai: Shanghai Foreign Language Education Press, 2001:94.
[16] 卞赵如兰. 西方对于京剧研究的情况[C]//中国艺术研究院. 中国戏曲艺术国际学术讨论会论文汇编. 北京:中国戏曲艺术国际学术讨论会秘书处, 1987:319-320.
[17] 荣广润. 地球村中的戏剧互动[M]. 上海:上海三联书店, 2007:48.
[18] 陈思思. 施高德与中国戏曲[J]. 国际汉学, 2017(1):79.
[19] 曹广涛. 基于演出视角的京剧英译与英语京剧[J]. 吉首大学学报:社会科学版, 2011, 32(6):158-160.
[20] 李洁. 不觉来到百花亭——魏莉莎的京剧英译实践和京剧英译观[J]. 东方翻译, 2013(1):63-67.
[21] 曹广涛. 戏曲英译百年回顾与展望[J]. 湖南科技学院学报, 2011, 32(7):142-145.
[22] 张琳琳. 从“青衣”等京剧术语的英译看文化翻译的归化和异化[J]. 上海翻译, 2013(4).
[23] 陈艳华. 京剧中的文化专有项英译研究——以京剧行当名称英译为例[J]. 海外英语:翻译研究, 2016(4):94-95.
[24] 周琰. 从功能对等论看京剧术语及剧名的英译[J]. 大众文艺, 2010(12):111-112.
[25] 董单. 京剧剧名翻译及方法探究[J]. 戏剧之家, 2017(11):269-270.
[26] 孙颖. 对外文化交流视域下京剧专门用途英语翻译实践探析[J]. 四川戏剧, 2016(9):40-44.
[27] 中国大百科全书总编辑委员会《戏曲曲艺》编辑委员会. 《中国大百科全书:戏曲曲艺》[M]. 北京:中国大百科全书出版社, 1992:171.
[28] 黄钧,徐希博. 京剧文化词典[M]. 上海:汉语大词典出版社, 2001.
[29] 吴同宾. 京剧知识手册[M]. 天津:天津教育出版社, 1995.
[30] 夏征农,陈至立. 辞海[M]. 上海:上海辞书出版社, 2009.
[31] 刘畅. 清代宫廷和苑囿中的室内戏台述略[J]. 故宫博物院院刊, 2003(2):80-87.
[32] 张淑娴. 清代皇宫室内戏台场景布局探微[J]. 中华戏曲, 2016(1):49-79.
[33] 侯希三. 北京老戏园子[M]. 北京:中国城市出版社, 1996.
[34] 杜定宇. 《英汉戏剧辞典》[M]. 成都:四川人民出版社, 1990.
[35] 廖奔. 中国古代剧场史[M]. 郑州:中州古籍出版社, 1997:142-143.
[36] 张发颖. 中国戏班史[M]. 沈阳:沈阳出版社, 1991.
[37] 钱南扬. 戏文概论[M]. 北京:中华书局, 2009:203.
[38] 傅瑾. 国剧的脚色、行当与人物[J]. 戏剧艺术, 2000(3):67-75.
[39] 安利. 脚色•戏曲脚色•角色之正名研究[J]. 牡丹江大学学报, 2009(3):22.
[40] 宁翠叶. 体育英语词汇手册[M]. 上海:复旦大学出版社, 2010:159-160.
[41] 李玉昆. 简论戏曲表演技法中“五法”的特征与运用[J]. 戏曲研究, 2013(2):256-264.
[42] 尚小云. 谈四功五法[J]. 戏曲艺术, 1982(2):18-24.
[43] 北京市、上海艺术研究所. 中国京剧史[M]. 北京:中国戏剧出版社, 2000:282.
Release date:

 2019-06-03    

翻译中的原型效应转移策略探究——以《推和敲》为例.杨舒涵

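Link

Title:

 翻译中的原型效应转移策略探究——以《推和敲》为例

Name:

 杨舒涵

Student ID:

 1601210796

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Release status:

 Public

Level of study:

 Master's

Degree:

 Professional Master of Engineering

Degree-granting institution:

 Peking University

School:

 School of Software and Microelectronics

Advisor 1 name:

 李博婷

Advisor 1 affiliation:

 School of Software and Microelectronics

Defense date:

 2019-05-27

Foreign-language title:

 Strategies for Prototype Shift in English-Chinese Translation: A Case Study of Word by Word

Keywords:

 翻译方法 英汉翻译 原型转移法

Foreign-language keywords:

 Translation methods English-Chinese translation Prototype Shift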

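Abstract:

  This translation project is based on Word by Word (《推和敲》) by the American author Kory Stamper (柯丽·斯坦珀). The book depicts the behind-the-scenes work of dictionary making, a niche subject both in China and abroad. While producing a trial translation, the author found that the book's wording looks simple but is not easy to understand. Traditional literal or free translation often fails to convey the meaning of the source text accurately, so the prototype shift method is needed. To date the prototype shift method has seen little use in translation research and practice, and scholars have not yet studied it extensively, which drew the author's interest and led to its choice as the research topic.

  Before starting the translation, the author analyzed the language of the source text with corpus analysis tools to identify translation difficulties. First, Stamper uses many low-frequency and rare words; quite a few have no ready-made Chinese equivalent, and for some even English background information is hard to find. Second, she coins new senses for existing words and does not shy away from vulgar language. On this basis, and drawing on a large number of existing translation examples, the author proposes that prototype shift should follow the principles of context, cultural background (从主原则), and equivalent effect, and should adopt four translation strategies: adding or omitting words to convey meaning, interchanging the abstract and the concrete, making skillful use of catchphrases, and shifting meaning through homophony. Together these ensure that the meaning of the source text is conveyed accurately at the semantic and cultural levels and safeguard the quality of the translation.

  The significance of this study lies in three aspects: (1) it introduces prototype theory from cognitive science and, on that basis, further explores the prototype shift translation method, offering a new perspective and approach to translation methods and overcoming the traditional view that a translation must be either literal or free; (2) it analyzes the situations in which prototype shift applies, helping to move beyond its current confinement to the translation of brand names, film titles, and the like; (3) it summarizes three translation principles and four translation strategies that can address a range of translation problems fairly effectively.

  In short, prototype theory offers a new angle for understanding translation and fresh insights into some translation issues, and is of positive significance for current translation practice and theoretical research.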

Foreign-language abstract:

This paper discusses translation strategies for prototype shift in English-Chinese translation, based on the translation of Kory Stamper’s book Word by Word. It is a nonfiction work that explicates the lexicographic details and dilemmas Stamper encountered as an associate editor at Merriam-Webster. In analyzing the book, this paper finds that it contains many simple words with obscure meanings which, if translated in traditional ways, for example word for word or sense for sense, would not convey the correct meaning of the source text. For this reason, this paper proposes adopting prototype shift, a concept that originates in cognitive science, to solve the problem.

First, a literature review shows that the prototype shift strategy has not been widely adopted in translation practice, nor has it been extensively explored by researchers. Second, analysis of existing translation cases leads to three translation principles, which emphasize the linguistic context, the cultural background, and the translation effect, and to four translation techniques. Finally, this paper compares the prototype shift strategy with some easily confused concepts, such as the conversion approach and free translation.

The significance of this study is mainly reflected in the following three aspects: (1) introducing the prototype theory to provide a new perspective and approach for translation; (2) analyzing and exploring the applicable situations for prototype shift; (3) proposing three principles and four techniques to realize prototype shift effectively.

Classification number:

 H059

Total pages:

 35

Total number of references:

 29

Reference list:
[1] Dryden, J. "The Three Types of Translation." Western Translation Theory: From Herodotus to Nietzsche Ed. Robinson, D. Beijing: Foreign Language Teaching and Research Press, 2006:172-174.
[2] Lefevere, Andre, and Ping Xia. Translation, Rewriting and the Manipulation of Literary Fame: 翻译、改写以及对文学名声的制控. Shanghai: Shanghai Foreign Language Education Press, 2010.
[3] Doherty, Stephen M. "Translation in Transition: Between Cognition, Computing and Technology." Journal of Specialised Translation, 2018:353-355.
[4] Shreve, Gregory M., and Erik Angelone. Translation and Cognition. Amsterdam: John Benjamins Pub. Co., 2010.
[5] 卢卫中, 王福祥. 翻译研究的新范式——认知翻译学研究综述[J]. 外语教学与研究, 2013(4):606-616.
[6] Rosch, E.H. "Cognitive Representations of Semantic Categories". Journal of Experimental Psychology: General, 1975:192-233.
[7] Lakoff, George. Women, Fire, and Dangerous Things: What Categories Reveal about the Mind. Chicago: University of Chicago Press, 1987.
[8] 维特根斯坦, 蔡远. 哲学研究[M]. 中国社会科学出版社, 2009.
[9] 朱立元. 当代西方文艺理论[M]. 华东大学出版社, 2008.
[10] Lapsley, Daniel K. and Benjamin Lasky. "Prototypic Moral Character." Identity: An International Journal of Theory and Research, 2001: 345-363.
[11] Taylor, John R. Linguistic Categorization: Prototypes in Linguistic Theory. England: Clarendon Press, 1989:52-53.
[12] 刘夏. 从原型理论视角分析中英文化中“红色”语义对比[J]. 现代交际, 2019(03):96+95.
[13] 肖群. 基于原型理论对英语动词多义性的认知语义研究[D]. 成都理工大学, 2017.
[14] 夏珺. 基于原型范畴理论的网络新兴词汇研究[J]. 教育教学论坛, 2019(12):203-204.
[15] 藏雅楠,卢绍刚. 原型范畴理论下“云XX”的认知社会语言学研究[J]. 现代语文, 2019(02):135-139.
[16] 霍克斯特. 结构主义与符号学[M]. 瞿铁鹏,译. 上海译文出版社, 1987.
[17] 程雨民. 关于词汇意义[J]. 外语与外语教学, 1999(01):13-14.
[18] 王佐良. 翻译:思考与试笔[M]. 外语教学与研究出版社, 1989.
[19] Kovecses, Zoltan. Language, Mind, and Culture: A Practical Introduction. England: Oxford University Press, 2006.
[20] 龙明慧. 翻译原型研究[D]. 中山大学出版社, 2011.
[21] 李勇. 花非花 雾非雾——翻译中的原型转移效应[J]. 译苑新谭, 2014(1):55-62.
[22] 张培基. 英汉翻译教程[M]. 上海外语教育出版社, 1980.
[23] 陈宏薇. 看似容易,实则不易[J]. 中国翻译, 2008(01):88-90.
[24] 谭卫国, 蔡龙权. 新编英汉互译教程[M]. 华东理工大学出版社, 2009.
[25] 徐慧. 从词义表达和词义引申的角度谈英汉翻译[D]. 上海交通大学,2011.
[26] 刘宓庆. 新编当代翻译理论[M]. 中国对外翻译出版公司, 北京, 2012.
[27] Schmitt, N., and Schmitt, D.. ''A Reassessment of Frequency and Vocabulary Size in L2 Vocabulary Teaching.'' Language Teaching, 2014:484-503.
[28] 刘锦.网络热词“直男癌”的建构与颠覆——基于社交媒体女权主义话语符号的分析[J].新闻知识,2017(11):84-87.
[29] Lutzky, Ursula, and A. Kehoe. ''Your Blog is (the) Shit A Corpus Linguistic Approach to the Identification of Swearing in Computer Mediated Communication.'' International Journal of Corpus Linguistics 2016:165-191.
Release date:

 2019-06-04    

针对英语词汇石化问题的自适应词块系统研究与设计.王丽君

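Link

Title:

 针对英语词汇石化问题的自适应词块系统研究与设计

Name:

 王丽君

Student ID:

 1601210744

Thesis language:

 chi

Major:

 Professional degree - Engineering - Computer Technology

Release status:

 Public

Level of study:

 Master's

Degree:

 Professional Master of Engineering

Degree-granting institution:

 Peking University

School:

 School of Software and Microelectronics

Advisor 1 name:

 李博婷

Advisor 1 affiliation:

 School of Software and Microelectronics

Advisor 2 name:

 俞敬松

Defense date:

 2019-05-27

Foreign-language title:

 Research and Design of An Adaptive Chunk Learning System for the Ease of English Vocabulary Fossilization

Keywords:

 词汇石化 词块教学法 自适应学习 产出性练习 系统设计

Foreign-language keywords:

 Vocabulary fossilization Lexical approach Adaptive learning Productive exercise System design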

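Abstract:

    Language fossilization is one of the characteristics of interlanguage; it refers to the phenomenon in which second-language learning stagnates or even regresses. Fossilization can occur at the levels of pronunciation, vocabulary, and grammar. Preventing or alleviating the fossilization of lexical competence, one aspect of vocabulary fossilization, is the core of this study. Many studies show that learning lexical chunks can alleviate the fossilization of lexical competence, but traditional chunk teaching has three limitations: first, learning content made up of unordered chunks ignores the paradigmatic and syntagmatic semantic relations between chunks, so learners struggle to build a network of word meanings; second, it emphasizes memorization and lacks productive practice; third, uniform content and methods cannot meet learners' individual needs. These limitations easily give learners the illusion that they can already use the vocabulary, which further leads to the fossilization of lexical competence.

    To alleviate this problem, the thesis draws on theories of vocabulary fossilization, semantic networks, the lexical approach, other related theories of second language acquisition, and the idea of adaptive learning, together with an analysis of existing English vocabulary learning tools and of the current state of lexical-competence fossilization among Chinese university students. On this basis it designs an adaptive lexical chunk learning system aimed at preventing or alleviating the fossilization of lexical competence, and completes a prototype design of the system. Its core ideas are as follows. First, the lexical approach is used to cultivate learners' chunk awareness and ability to use chunks, adding contextual information and avoiding the direct one-to-one equation of word meanings that produces negative transfer from the mother tongue. Second, learners grasp the relations between chunks by learning concepts and collocations, which builds and activates their word-meaning networks and increases the diversity and accuracy of expression. Third, exercise types of differing task complexity let learners advance level by level from memorizing chunks to producing them. Fourth, adaptive rules for different learning stages and teaching feedback make the content targeted and guide learners out of their comfort zone of vocabulary use, which avoids lexical inertia and alleviates the fossilization of lexical competence.

    A two-week teaching experiment was conducted with 54 first-year undergraduates not majoring in English. Ten formed a pilot group used to determine the learning materials and experimental details; the remaining 44 were randomly divided into an experimental group and a control group of 22 each, and a pre-test showed no significant difference between the two groups' scores. The experimental group used the adaptive chunk learning system, while the control group studied traditional unordered chunk lists; the total amount and content of learning were identical for both groups. A post-test followed the experiment, and a delayed test was given 10 days later. Questionnaires and interviews were also used to further verify learning outcomes and satisfaction. The results show that the adaptive chunk learning system improves the diversity and accuracy of learners' lexical expression, and that it outperforms traditional chunk learning both in retaining the gains in alleviating lexical-competence fossilization and in learner satisfaction.

    The adaptive chunk learning system designed in this study effectively alleviates the fossilization of lexical competence and improves the diversity and accuracy of lexical expression. It also enriches research on the lexical approach and offers a reference for classroom teaching and for the design of English vocabulary learning tools.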

Foreign-language abstract:

    Fossilization is one of the main features of interlanguage: second-language learning stagnates or even regresses. Fossilization may occur in pronunciation, vocabulary, and grammar, and preventing and easing vocabulary fossilization is the core issue of this study. Many studies have shown that the lexical approach can ease vocabulary fossilization, but traditional chunk teaching still has three limitations. First, unordered chunk learning materials ignore the paradigmatic and syntagmatic relations of the semantic network, so learners can hardly build their own semantic networks. Second, it emphasizes memorization but lacks productive exercises. Third, uniform learning materials and methods can hardly meet learners' individual needs. These problems can aggravate vocabulary fossilization because learners may be unable to use appropriate and varied words in an actual context.

    To make up for these deficiencies, and in light of theories of language fossilization, semantic networks, the lexical approach, other related theories of second language acquisition, and the idea of adaptive learning, this paper designs a lexical chunk learning system for easing vocabulary fossilization, based on an analysis of learners’ vocabulary fossilization and of existing English vocabulary learning tools. The study completed a prototype design of the system. Its core ideas are as follows. First, it cultivates learners' chunk awareness and competence through the lexical approach, so that lexical context is enriched and direct one-to-one equivalence of word meanings is avoided. Second, learners grasp the relations among lexical chunks by acquiring concepts and collocations, so that semantic networks are constructed and activated and the diversity and accuracy of expression are enhanced. Third, exercises of different task complexity take learners from memorizing chunks to producing them. Fourth, adaptive rules for the learning process and feedback make learning more targeted and guide learners out of their comfort zone in vocabulary use.

    A two-week teaching experiment was conducted with 54 first-year undergraduates not majoring in English. Ten participants were randomly chosen for a pilot experiment to determine the learning materials and experimental details. The remaining 44 were randomly divided into an experimental group and a control group of 22 participants each. A pre-test showed no statistically significant difference between the two groups' scores. The experimental group learned chunks with the adaptive lexical chunk learning system, while the control group used traditional unordered chunk lists. The amount and content of the learning material were the same for both groups. A post-test was given at the end of the experiment, and a delayed test 10 days after the post-test. In addition, questionnaires and interviews were used to assess the effect of, and satisfaction with, the two learning approaches. The results show that the adaptive lexical chunk learning system is superior to the traditional chunk learning method in easing vocabulary fossilization, especially in the diversity and accuracy of learners' expressions, and also in retention and satisfaction.

    The adaptive lexical chunk learning system designed in this study can, on the one hand, effectively alleviate vocabulary fossilization, especially by improving the accuracy and diversity of expression. On the other hand, it enriches research on the lexical approach and offers a reference for classroom instruction and the design of vocabulary acquisition tools.

Classification number:

 TP3

Total pages:

 107

Total number of references:

 77

Reference list:
[1] 加斯 S,塞林克 L.第二语言习得[M].赵杨,译.北京:北京大学出版社. 2011.
[2] 蔡基刚.关于我国大学英语教学重新定位的思考[J].外语教学与研究, 2010, 42(4): 306-8.
[3] 郑秋萍.心理语言学视角下的二语词汇石化现象分析与防治策略[J].外语研究,2014(06):59-62.
[4] 吴旭东,陈晓庆.中国英语学生课堂环境下词汇能力的发展[J].现代外语,2000(04):349-360.
[5] Tinkham T.The effect of semantic clustering on the learning of second language vocabulary[J]. System,1993,21(3):371-380.
[6] Laufer B. ‘Sequence’ and ‘Order’ in the Development of L2 Lexis: Some Evidence from Lexical Confusions[J]. Applied Linguistics, 1990, 11(3): 281-296.
[7] Laufer B. The development of passive and active vocabulary in a second language: Same or different?[J]. Applied linguistics, 1998, 19(2): 255-271.
[8] Laufer B, Paribakht T S. The relationship between passive and active vocabularies: Effects of language learning context[J]. Language learning, 1998, 48(3): 365-391.
[9] 崔艳嫣,王同顺.接受性词汇量、产出性词汇量与词汇深度知识的发展路径及其相关性研究[J].现代外语,2006(04):392-400+437-438.
[10] Lewis M. The lexical approach[M]. Hove: Language Teaching Publications, 1993.
[11] Nattinger J R, Decarrico J S. Lexical phrases and language teaching[M]. Oxford University Press, 1992.
[12] 周红云.语言的僵化现象[J].外语界,2003(04):19-26.
[13] Torabian A H, Maros M, Subakir M Y M. Lexical collocational knowledge of Iranian undergraduate learners: implications for receptive & productive performance[J]. Procedia-Social and Behavioral Sciences, 2014, 158: 343-350.
[14] Selinker L. Interlanguage[J]. IRAL-International Review of Applied Linguistics in Language Teaching, 1972, 10(1-4): 209-232.
[15] Richards J C, Schmidt R W. Longman dictionary of language teaching and applied linguistics [M]. Routledge, 2013.
[16] Long M H. Stabilization and Fossilization in Interlanguage Development[A]. In Doughty, Catherine J, Michael H. Long, eds. The handbook of second language acquisition[C]. John Wiley & Sons, 2008, 27: 487-535.
[17] 戴炜栋,牛强.过渡语的石化现象及其教学启示[J].外语研究,1999(02):11-16.
[18] Krashen S. Principles and practice in second language acquisition[J]. 1982.
[19] Han Z H. Fossilization: five central issues[J]. International Journal of Applied Linguistics, 2004, 14(2): 212-242.
[20] Selinker L. Fossilization as simplification ? [J]. 1993: 197-216.
[21] Selinker L, Han Z H. Fossilization: Moving the concept into empirical longitudinal study[A]. In Davis A. Studies in language testing: Experimenting with uncertainty[C].Cambridge University Press, 2001, 27: 276-291.
[22] 刘座雄.英语写作词汇能力石化现象探析[J].西南民族大学学报(人文社科版),2007(S1):155-158.
[23] 石永新.大学生英语写作中的词汇石化现象研究[D].吉林大学, 2017.
[24] 陈文存.对外语和二语学习者石化现象研究问题的评述[J].外语教学理论与实践,2010(01):89-95+83.
[25] 赵文静.母语汉语学生在汉英同传中的负迁移现象[D].北京外国语大学, 2018.
[26] Meara P. A note on passive vocabulary[J]. Interlanguage studies bulletin (Utrecht), 1990, 6(2): 150-154.
[27] 陈建生.英语词汇教学 “石化” 消解研究[D].西南大学, 2009.
[28] 桂诗春.新编心理语言学[M].上海:上海外语教育出版社.2000.
[29] Schwartz A I, Kroll J F. Bilingual lexical activation in sentence context[J]. Journal of memory and language, 2006, 55(2): 197-212.
[30] 陈玫.从纵聚合和横组合关系看英语写作中的措辞缺陷[J].外语与外语教学,2005(06):32-35.
[31] Singleton D M. Exploring the second language mental lexicon[M]. Cambridge: Cambridge University Press,1999.
[32] 刘绍龙,傅蓓,胡爱梅.不同二语水平者心理词汇表征纵横网络的实证研究[J].解放军外国语学院学报,2012,35(02):57-60+70+128.
[33] 李小撒,王文宇.WordNet与BNC介入下的第二语言心理词汇联系模式实证研究[J].语言科学,2016,15(01):74-84.
[34] Jiang N. Lexical representation and development in a second language[J]. Applied linguistics, 2000, 21(1): 47-77.
[35] Cowie A P. Phraseology: theory, analysis, and applications[M]. Oxford: Clarendon press, 1998.
[36] Becker J D. The phrasal lexicon[A] In Nash-Webber B, Schank R. Proceedings of the 1975 workshop on Theoretical issues in natural language processing [C].Cambridge, Massachusetts, 1975, 60-63.
[37] 杨惠中,卫乃兴.中国学习者英语口语语料库建设与研究[M].上海:上海外语教育出版社.2005.
[38] Lewis M. Implementing the lexical approach: Putting theory into practice[M]. Hove: Language Teaching Publications, 1997.
[39] 周正钟.语块教学法新探—理论, 实证与教学延伸[M].苏州大学出版社. 2014.
[40] 贾知辉.词块概念下的高中英语词汇教学实证研究[D].哈尔滨师范大学, 2016.
[41] 濮建忠.英语词汇教学中的类联接、搭配及词块[J].外语教学与研究,2003(06):438-445+481.
[42] 卫乃兴.中国学习者英语口语语料库初始研究[J].现代外语,2004(02):140-149+216-217.
[43] Bychkovska T, Lee J J. At the same time: Lexical bundles in L1 and L2 university student argumentative writing[J]. Journal of English for Academic Purposes, 2017, 30: 38-52.
[44] Lu X, Deng J. With the rapid development: A contrastive analysis of lexical bundles in dissertation abstracts by Chinese and L1 English doctoral students[J]. Journal of English for Academic Purposes, 2019,(39)21-36.
[45] 郭小宁.中国英语专业学生预制词块鉴别能力研究[D].东北师范大学, 2009.
[46] 丁言仁,戚焱.词块运用与英语口语和写作水平的相关性研究[J].解放军外国语学院学报,2005(03):49-53.
[47] Krashen S D. Principles and practice in second language acquisition[M]. New York, Oxford: Pergamon,1982.
[48] Swain M. Communicative competence: Some roles of comprehensible input and comprehensible output in its development[J]. Input in second language acquisition, 1985, 15: 165-179.
[49] 何花.非英语专业研究生英语输出中的“注意”培训研究[D].上海外国语大学,2014.
[50] 冯纪元,黄姣.语言输出活动对语言形式习得的影响[J].现代外语,2004(02):195-200+220.
[51] 戴运财,戴炜栋.从输入到输出的习得过程及其心理机制分析[J].外语界,2010(01):23-30+46.
[52] 王初明.外语写长法[J].中国外语,2005(01):45-49.
[53] Hulstijn J H, Laufer B. Some empirical evidence for the involvement load hypothesis in vocabulary acquisition[J]. Language learning, 2001, 51(3): 539-558.
[54] 孔繁霞,王歆.任务模式与类型对词汇附带习得的影响研究[J].外语界,2014(06):21-29.
[55] 魏梅,王立非.任务类型与频次因素对大学生英语惯用短语学习的影响——对投入量假设的再考察[J].现代外语,2011,34(04):372-380.
[56] Vigil N A, Oller J W. Rule Fossilization: A Tentative Model[J]. Language learning, 1976, 26(2): 281-295.
[57] Truscott J. The case against grammar correction in L2 writing classes[J]. Language learning, 1996, 46(2): 327-369.
[58] Han Y, Hyland F. Academic emotions in written corrective feedback situations[J]. Journal of English for Academic Purposes, 2019, 38: 1-13.
[59] Rassaei E. Corrective feedback, learners' perceptions, and second language development[J]. System, 2013, 41(2): 472-483.
[60] Bitchener J. Evidence in support of written corrective feedback[J]. Journal of second language writing, 2008, 17(2): 102-118.
[61] 蒋景阳.英语作为外语教学的课堂中非刻意负反馈作用的研究[D].上海外国语大学, 2010.
[62] Brusilovsky P . Methods and techniques of adaptive hypermedia[J]. User Modeling and User-Adapted Interaction, 1996, 6(2-3):87-129.
[63] Weber G, Brusilovsky P. ELM-ART: An adaptive versatile system for Web-based instruction[J]. International Journal of Artificial Intelligence in Education (IJAIED), 2001, 12: 351-384.
[64] Alshammari M. Adaptation based on learning style and knowledge level in e-learning systems [D].University of Birmingham, 2016.
[65] 廖轶.面向基础教育的自适应学习服务系统研究与应用[D].北京交通大学, 2017.
[66] 陆宏,赵艳平.高中英语词汇自适应学习系统的研制[J].现代教育技术,2014,24(11):47-52.
[67] Li M, Ogata H, Hou B, et al. Development of adaptive vocabulary learning via mobile phone e-mail[C]//2010 6th IEEE International Conference on Wireless, Mobile, and Ubiquitous Technologies in Education. IEEE, 2010: 34-41.
[68] Jung J, Graf S. An approach for personalized web-based vocabulary learning through word association games[C]//2008 International Symposium on Applications and the Internet. IEEE, 2008: 325-328.
[69] Lu M. Effectiveness of vocabulary learning via mobile phone[J]. Journal of computer assisted learning, 2008, 24(6): 515-525.
[70] 吕京.基于自适应模式的英语阅读教学研究[D].北京大学, 2015.
[71] 林毅君.基于自适应学习模式的英语从句语法教学研究[D].北京大学, 2015.
[72] 徐亮.基于自适应学习模式的大学英语产出性词汇教学研究[D].北京大学, 2015.
[73] 阙颖.面向自适应教学的英语口语资源加工方法的设计与实现[D].北京大学, 2017.
[74] 宋凌云.基于自适应学习模式的高中英语听力教学研究[D].北京大学, 2016.
[75] Huckin T, Bloch J. Strategies for inferring word-meanings in context: a cognitive model [A]. In Haynes M, Huckin T, Coady J. Second language reading and vocabulary learning[C]. Ablex Publishing Corporation, 1993, 153-178.
[76] 杨世登.英语学习者产出词汇的发展模式[J].外国语言文学,2007(04):254-259+288.
[77] Gardner R C, Lalonde R N, MacPherson J. Social factors in second language attrition[J]. Language learning, 1985, 35(4): 519-540.


公开日期:

 2019-06-19    

海外汉学著作精准回译策略研究——以《中国武术:从古代到21世纪》为例.钱康

链接

题名:

 海外汉学著作精准回译策略研究——以《中国武术:从古代到21世纪》为例    

姓名:

 钱康    

学号:

 1601210677    

论文语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 李博婷    

导师1单位:

 软件与微电子学院    

论文答辩日期:

 2019-05-27    

外文题名:

 Strategies of Accurate Back Translation of Overseas Sinological Works——Taking Chinese Martial Arts: From Antiquity to the Twenty-First Century as an Example    

关键词:

 海外汉学著作 精准回译 武术翻译    

外文关键词:

 Overseas Sinological Works Accurate Back Translation Martial Arts Translation    

论文摘要:

  自上世纪八十年代以来,海外对中国的研究不断加强,出现了越来越多的汉学著作,中国学界也开始重视起这些著作的译介,这些译介在中国的海外汉学研究中扮演了重要角色。在海外汉学著作的翻译过程中,回译问题不可避免,由于海外汉学著作多为学术类著作,行文严谨,措辞谨慎,这也对译者的回译提出了更高的要求,需达到精准回译。本次翻译项目的源文本来自于《中国武术:从古代到21世纪》(Chinese Martial Arts: From Antiquity to the Twenty-First Century),该书介绍了中国武术的发展史,是一本典型的汉学著作,作者为美国著名历史学家龙佩(Peter Lorge)。

   本文首先阐述了本次研究的背景与意义,介绍了本次翻译的书籍以及所用到的翻译工具,然后从“海外汉学”,“回译”和“中国武术的翻译”三个角度进行了文献综述。在第三章中,笔者基于《中国武术》选译章节的翻译,总结出该书中出现的三大回译现象,即引文回译、词汇回译、以及原文错误之回译。在引文回译中,笔者将引用类型细分成“直接引用”和“间接引用”,提出不同引用类型下,引文的回译处理方式,其中对于“间接引用”的引文进行回译时,还需留意“古今异义”现象的出现;在词汇回译中,笔者将词汇分为“人名”,“武器名”和“武术相关术语”三大类,分别就这三类词汇出现的回译问题进行了探讨;在原文错误之回译中,笔者将错误分为“名称错误”和“史实描述错误”两类,就错误性质进行了定性,并对这些错误的回译方式给出了建议。

  最后笔者基于本次翻译实践的过程,提出“巧用四字格”、“合理减译”、“归化为主”以及“恪守读者视角原则”四大精准回译策略。本次研究是对武术历史类海外汉学著作精准回译策略研究的初试,以期为同类著作的翻译提供一些参考。

外文摘要:

  During the process of translating overseas sinological works into Chinese, back translation is inevitable. Since overseas sinological works are mostly academic works written rigorously and cautiously, they set a high demand on the translator’s skills and abilities. Accurate back translation is therefore needed. The present translation project is based on Chinese Martial Arts: From Antiquity to the Twenty-First Century, authored by Peter Lorge, which discusses the history of Chinese martial arts. 

  This paper summarizes three major problems in the back translation of overseas sinological works, namely citations, special nouns, and errors in the source text. For citations, the paper subdivides them into “direct citations” and “indirect citations” and proposes a different back translation method for each. For special nouns, the paper divides them into three categories, “names of people”, “names of weapons” and “martial-arts-related terms”, and discusses the back translation of each. As for errors in the source text, the paper divides them into two types, “errors of names” and “errors of historical description”, and advises on how to deal with them in back translation.

  Based on this translation practice, the paper then puts forward four strategies for accurate back translation, namely “using Chinese four-character structures”, “omitting information already known to the target audience”, “taking domestication as the mainstay” and “observing the target-reader perspective”. In conclusion, the paper points out that, for accurate back translation of works on Chinese martial arts, an extensive bibliographic search is a prerequisite and careful contextualization of the object of translation is necessary.

分类号:

 H059    

论文总页数:

 191    

参考文献总数:

 43    

参考文献列表:
程裕祯. 关于海外汉学研究[J]. 中国文化研究, 1997(2):118-121.
党晟. 往而复来——漫议西方汉学著作的翻译[J]. 读书, 2018(09):157-164.
丁红艳, 陆志国.也谈文学翻译的原则[J]. 延安教育学院学报, 2004(01):68-70.
方骏. 中国海外汉学研究现状之管见[J]. 国际汉学, 2000(02):9-16.
方梦之. 中国译学大辞典[Z]. 上海外语教育出版社, 2011:97.
冯庆华, 李美. 文体翻译论[M]. 上海外语教育出版社, 2001.
郭沫若. 甲骨文合集[M]. 中华书局, 1999:4541
韩丹. 我国古代东北民族的射柳活动考[J]. 哈尔滨体育学院学报, 2004(1):1-3.
何一民. 海外“中国学”与中国“中国学”[J]. 四川师范大学学报(社会科学版), 2011, 38(01):109-114.
贺显斌. 回译的类型、特点与运用方法[J]. 中国科技翻译, 2002, 15(4):45-47.
胡厚宣. 甲骨文合集释文一[M]. 中国社会科学出版社, 1999:1803.
胡厚宣. 甲骨续存补编[M]. 天津古籍出版社, 1996.
季金珂. 浅谈武术类文本的回译策略[J]. 俄语学习, 2017(05):54-60.
焦丹. 论“一带一路”背景下的中华武术文化翻译及国际传播[J]. 翻译界, 2017:81.
乐黛云. 多元文化发展中的问题及文学可能作出的贡献[J]. 中国文化研究, 2001(1):9-15.
李宁. 英译汉中“四字格”美学价值试析[J]. 新疆大学学报(哲学•人文社会科学汉文版), 2003(s1):161-163.
李长栓. 非文学翻译[M]. 外语教学与研究出版社, 2009:91
卢安. 武术类英文版图书国外发行现状研究与启示[J]. 内蒙古农业大学学报(社会科学版), 2014, 16(3).
鲁迅. 鲁迅全集·且介亭杂文二集[M]. 人民文学出版社, 1981:61-63.
罗安宪. “学而优则仕”的历史流变[J]. 中国社会导刊, 2006(6):14-15.
罗永洲. 中国武术英译现状与对策[J]. 外语教学理论与实践, 2008(4):58-63.
吕洁. 论英译汉中汉语四字格的使用[J]. 当代教师教育, 2002, 19(4):73-76.
钱钟书. 林纾的翻译[J]. 中国翻译, 1985(11):2-10.
万雪梅. 试论汉学翻译[J]. 南京师范大学文学院学报, 2012(1):84-88.
王宏印, 江慧敏. 京华旧事,译坛烟云——Moment in Peking的异语创作与无根回译[J]. 外语与外语教学, 2012(2):65-69.
王宪明. 返朴归真最是信──由几处经典引文回译所想到的[J]. 中国翻译, 1994(4):72-76.
王正良. 回译研究[M]. 大连海事大学出版社, 2007.
王正胜. 回译研究的创新之作——《回译研究》介评[J]. 外语教育, 2009, 9(00):167-170.
谢应喜. 武术翻译初探[J]. 中国翻译, 2008(1):61-64.
徐海亮. 武术翻译四项原则[J]. 中华武术, 2005(1):24-25.
杨伯峻. 论语译注.大字本[M]. 中华书局, 2015.
叶红卫, 刘金龙. 近30年来汉学文献在国内的翻译与出版[J]. 出版发行研究, 2015(5):61-63.
张博. 反义类比构词中的语义不对应及其成因[J]. 语言教学与研究, 2007(1):43-51.
张芳. 汉学论著翻译问题论析——以伊沛霞《剑桥插图中国史》为例[J]. 江苏教育学院学报:社会科学版, 2014(7):93-97.
张西平. 西方汉学研究导论[M]. 学苑出版社, 2007:25.
指文烽火工作室. 中国古代实战兵器图鉴[M]. 中国长安出版社, 2015:66.
周琳. 古今异义成语语义转移的主要类型及成因[J]. 现代语文(语言研究版), 2014(1):42-46.
周庆杰. 杨式太极拳翻译研究[J]. 中国体育科技, 2004, 40(5).
朱明胜. 文化词的翻译——以“麻花”的英译为例[J]. 译林(学术版), 2012(6):180-185.
Brislin, R. W. Back-translation for cross-cultural research[J]. Journal of cross-cultural psychology, 1970.
Mark Shuttleworth & Moira Cowie. 翻译研究词典[Z]. 外语教学与研究出版社, 2005.
Newmark. P. Paragraphs on Translation [M]. Clevedon: Multilingual Matters Ltd, 1993.
Toury. G. In Search of A Theory of Translation [M]. Tel Aviv: Porter Institute for Poetics and Semiotics, 1980.

公开日期:

 2019-06-14    

基于语料库方法研究G.K.切斯特顿的反犹问题.窦蕾

链接

题名:

 基于语料库方法研究G.K.切斯特顿的反犹问题    

姓名:

 窦蕾    

学号:

 1601210504    

论文语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 李博婷    

导师1单位:

 北京大学软件与微电子学院    

论文答辩日期:

 2019-05-27    

外文题名:

 Corpus-based Approaches to Anti-Semitism of G.K. Chesterton    

关键词:

 G.K. 切斯特顿 基于语料库 Cohen’s d 反犹主义    

外文关键词:

 G.K. Chesterton Corpus-based Approaches Cohen’s d Anti-Semitism    

论文摘要:

吉尔伯特·基思·切斯特顿是20世纪初的英国作家和记者。他生前被指控为反犹主义,如今他是否反犹仍是有争议的问题。笔者使用语料库方法研究两个问题:1、切斯特顿的犹太观点有何显著的特点?2、他的犹太观点特点是否同特定的思维模式相关? 研究步骤如下:1、建立切斯特顿几乎全部作品语料库和同时代英国英语参考语料库,使用POS和USAS标注系统进行标注。2、收集一组犹太主题词汇,在切斯特顿语料库中研究它们的搭配,分析出切斯特顿犹太观点的特点。3、笔者认为切斯特顿在不同时期作品中广泛分布的语言特征有可能是他的思维模式的语言表征,因而笔者依据年份信息将切斯特顿语料库分组,同时也将参考语料库分组,使用cohen’s d方法计算两组语料语言特征的效应量差别,并选择cohen’s d值大于0.8的语言特征作为具有关键性的语言特征,将它们视作潜在的切斯特顿思维模式的语言表征。4、筛选出具有关键性的语言特征和搭配的重合部分,并分析它们在切斯特顿语料库中的用法,考察其用法是否与同犹太主题词汇共现时的用法相通,以此揭示切斯特顿思维模式与犹太观点的联系。 通过搭配分析,笔者发现:切斯特顿对犹太人在西方世界的存在、犹太人与金钱的关系、犹太人的人际关系都给予了负面的评价;他常常将犹太人分为不同的类型,并对这些类型进行两极化的评价;他多次将犹太教与基督教对立起来;他对犹太人的“身份”给予了一定关注。结合关键性分析,笔者发现切斯特顿的犹太观点存在背后思维模式的支撑:他将基督教作为理解其他宗教的参照,因而会将犹太教与基督教对立;他常常发现并展示事物的矛盾之处,而他对犹太人外在身份与实质的矛盾的关注吻合这一思维模式。

外文摘要:

Gilbert Keith Chesterton was a British writer and journalist of the early 20th century. He was accused of anti-Semitism during his lifetime, and whether he was an anti-Semite remains controversial today. This study uses corpus linguistics methods to answer two questions: 1. What are the most prominent features of his views of Jewishness? 2. Are those features related to his ideological frame of mind as a whole?

The research steps are as follows: 1. Build a corpus of most of Chesterton’s works, as well as a reference corpus of British English of roughly the same era, and tag both corpora with the POS and USAS annotation systems. 2. Collect a set of Jewish-theme words, calculate their lemma and semantic-tag collocations in the Chesterton corpus, and derive the prominent features of Chesterton’s view of Jewishness through collocation analysis. 3. The author argues that key linguistic features widely distributed across Chesterton’s works from different periods may be linguistic representations of his general frame of mind. The study therefore divides each of the two corpora into groups of texts by year and uses Cohen’s d to measure the effect size of the difference in linguistic features between them. According to the conventional benchmark, linguistic features with a Cohen’s d larger than 0.8 are selected as key features and potential representations of his general frame of mind. 4. The author then filters out the overlap between the key features and the collocations, and analyzes whether their usage in the Chesterton corpus in general is related to their usage in collocation with Jewish-theme words, in order to reveal the connection between Chesterton’s view of Jewishness and his general frame of mind.

Through collocation analysis, the author draws these conclusions: 1. Chesterton holds negative opinions of Jewish people regarding their existence in the Western world, their relationship with money, and their interpersonal relationships; 2. he often divides Jewish people into different types and evaluates these types in polarized terms; 3. he sets up Judaism as the opposite of Christianity; 4. he is concerned with Jewish identity. Combining these findings with key feature analysis, the author finds that Chesterton's view of Jewishness is supported by his general frame of mind: he uses Christianity as a reference for understanding other religions, and thus pits Judaism against Christianity; and he often finds and displays contradictions in things, so his concern with the contradiction between Jewish external identity and substance is consistent with this mode of thinking.
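
The key-feature selection described in this abstract can be illustrated with a minimal sketch. It assumes each linguistic feature has already been reduced to a list of relative frequencies, one per group of texts, and it uses the pooled-standard-deviation form of Cohen's d; the abstract does not say which variant the thesis used. The function names `cohens_d` and `key_features` and the sample numbers are hypothetical, not taken from the thesis.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d between two samples, using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; d is undefined")
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def key_features(target, reference, threshold=0.8):
    """Keep features whose d (target vs. reference) exceeds the threshold.

    `target` and `reference` map a feature name to its per-group
    relative frequencies (an assumed input layout).
    """
    selected = {}
    for feature in target.keys() & reference.keys():
        d = cohens_d(target[feature], reference[feature])
        if d > threshold:
            selected[feature] = d
    return selected


if __name__ == "__main__":
    # Made-up frequencies for illustration only.
    target = {"paradox": [5.0, 6.0, 7.0], "the": [1.0, 2.0, 3.0]}
    reference = {"paradox": [1.0, 2.0, 3.0], "the": [1.0, 2.0, 3.0]}
    print(key_features(target, reference))
```

The 0.8 threshold mirrors the benchmark quoted in the abstract. Only positive d values are kept here, meaning features overrepresented in the target corpus; whether the thesis also considered underrepresented features is not stated.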

分类号:

 I56    

论文总页数:

 60    

参考文献总数:

 50    

参考文献列表:
[1] Dean Rapp. The Jewish response to GK Chesterton's antisemitism, 1911–33[J]. 1990.
[2] Owen-Dudley Edwards. Chesterton and Tribalism[J]. The Chesterton Review, 1979, 6(1): 33-69.
[3] Simon Mayers. Chesterton’s Jews: Stereotypes and Caricatures in the Literature and Journalism of G. K. Chesterton[M]. CreateSpace Independent Publishing Platform, 2013: 132.
[4] Ann Farmer. Chesterton: Religion, anti-Semitism and the Politics of the Underdog[J]. The Chesterton Review, 2008, 34(1/2): 163-186.
[5] G. K. Chesterton's Works on the Web. 2019.
[6] Leo-A Hetzler. Chesterton's Political Views, 1892-1914, with Comments on Chesterton and Anti-Semitism: to be continued[J]. The Chesterton Review, 1981, 7(2): 119-138.
[7] Hitler branded a barbarian. 1933: 14.
[8] Anthony Julius. Trials of the Diaspora: A History of Anti-semitism in England. Oxford University Press, 2012: 242-347.
[9] Joyce Eisenberg, Scolnic Ellen. Dictionary of Jewish Words: A JPS Guide[M]. Jewish Publication Society, 2010.
[10] Steven Beller. Antisemitism: A very short introduction[M]. Oxford University Press, USA, 2015.
[11] Todd-M Endelman. Native Jews and Foreign Jews(1870-1914). Berkeley and Los Angeles, California: University of California Press, 2002: 155.
[12] William Oddie. Reform,revolution,and the religion of mankind. New York: 2008: 80.
[13] Fred Black. A Note on Chesterton and Anti-Semitism[J]. The Chesterton Review, 1977.
[14] Kevin-L Morris. Reflections on Chesterton's Zionism[J]. The Chesterton Review, 1987, 13(2): 163-176.
[15] Bryan Cheyette. An overwhelming question: Jewish stereotyping in English fiction and society, 1875-1914. University of Sheffield, 1986.
[16] Bryan Cheyette. Constructions of 'the Jew' in English Literature and Society: Racial Representations, 1875-1945. Cambridge University Press, 1995: 179-205.
[17] Anna Vaninskaya. ‘My mother, drunk or sober’: GK Chesterton and patriotic anti-imperialism[J]. History of European Ideas, 2008, 34(4): 535-547.
[18] Mike Scott. PC analysis of key words—and key key words[J]. System, 1997, 25(2): 233-245.
[19] Marina Bondi, Scott Mike. Keyness in texts[M]. John Benjamins Publishing, 2010.
[20] Paul Baker, Gabrielatos Costas, Khosravinik Majid, et al. A useful methodological synergy? Combining critical discourse analysis and corpus linguistics to examine discourses of refugees and asylum seekers in the UK press[J]. Discourse & society, 2008, 19(3): 273-306.
[21] Vaclav Brezina. Statistics in Corpus Linguistics: A Practical Guide[M]. Cambridge: Cambridge University Press, 2018.
[22] Paul-Edward Rayson. Computational tools and methods for corpus compilation and analysis[J]. 2015.
[23] Bill Louw. Irony in the text or insincerity in the writer? The diagnostic potential of semantic prosodies[J]. Text and technology: In honour of John Sinclair, 1993: 157-176.
[24] Michael Stubbs. Collocations and semantic profiles: On the cause of the trouble with quantitative studies[J]. Functions of language, 1995, 2(1): 23-55.
[25] 朱一凡,胡开宝. “被” 字句的语义趋向与语义韵——基于翻译与原创新闻语料库的对比研究. 2014.
[26] Peter Stockwell, Mahlberg Michaela. Mind-modelling with corpus stylistics in David Copperfield[J]. Language and Literature, 2015, 24(2): 129-147.
[27] Rocío Montoro. The creative use of absences[J]. International Journal of Corpus Linguistics, 2018, 23(3): 279-310.
[28] David-L Hoover. Corpus stylistics, stylometry, and the styles of Henry James[J]. Style, 2007, 41(2): 174-203.
[29] Fulya Erdentuğ, Musayeva Vefalı Gülşen. What is “old” and “past” in New Age discourse? A qualitative analysis of corpus evidence[J]. Discourse, Context & Media, 2018, 24: 85-91.
[30] Shuki-J Cohen, Holt Thomas-J, Chermak Steven-M, et al. Invisible empire of hate: gender differences in the Ku Klux Klan's online justifications for violence[J]. Violence and gender, 2018, 5(4): 209-225.
[31] Sin Yan Eureka Ho, Crosthwaite Peter. Exploring stance in the manifestos of 3 candidates for the Hong Kong Chief Executive election 2017: Combining CDA and corpus-like insights[J]. Discourse & Society, 2018, 29(6): 629-654.
[32] Laura-A Cariola. A Corpus‐based Psychodynamic Analysis of Body Boundary Imagery in Hitler's Mein Kampf[J]. International Journal of Applied Psychoanalytic Studies, 2014, 11(4): 318-338.
[33] Marcus Bridle. Male blues lyrics 1920 to 1965: A corpus based analysis[J]. Language and Literature, 2018, 27(1): 21-37.
[34] Hendrik De Smet, Flach Susanne, Tyrkkö Jukka, et al. The corpus of Late Modern English (CLMET), version 3.1: Improved tokenization and linguistic annotation[J]. KU Leuven, FU Berlin, U Tampere, RU Bochum, 2015.
[35] Vaclav Brezina, McEnery Tony, Wattam Stephen. Collocations in context: A new perspective on collocation networks[J]. International Journal of Corpus Linguistics, 2015, 20(2): 139-173.
[36] Scott Piao, Bianchi Francesca, Dayrell Carmen, et al. Development of the multilingual semantic annotation system[A]//2015: 1268-1274.
[37] Dawn Archer, Wilson Andrew, Rayson Paul. Introduction to the USAS category system[J]. Benedict project report, October 2002, 2002.
[38] Paul Rayson. Matrix: A statistical method and software tool for linguistic analysis through corpus comparison. Lancaster University, 2003.
[39] Dana Gablasova, Brezina Vaclav, McEnery Tony. Collocations in corpus‐based language learning research: Identifying, comparing, and interpreting the evidence[J]. Language learning, 2017, 67(S1): 155-179.
[40] Jacob Cohen. Statistical power analysis for the behavioral sciences. Routledge, 1988.
[41] Daniel Lakens. Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs[J]. Frontiers in Psychology, 2013, 4: 863.
[42] William-J Crawford, McDonough Kim, Brun-Mercer Nicole. Identifying Linguistic Markers of Collaboration in Second Language Peer Interaction: A Lexico-grammatical Approach[J]. TESOL Quarterly, 2019, 53(1): 180-207.
[43] Norman Solomon (所罗门). 犹太人与犹太教: a very short introduction[M]. 王广州, 译. 南京: 译林出版社, 2014.
[44] Paul Baker, Levon Erez. Picking the right cherries? A comparison of corpus-based and qualitative analyses of news articles about masculinity[J]. Discourse & Communication, 2015, 9(2): 221-236.
[45] Patrick Hanks, Hardcastle Kate, Hodges Flavia. A dictionary of first names[M]. New York; Oxford: Oxford University Press, 2006.
[46] Richard-Coates-Peter-McClure Patrick Hanks. The Oxford Dictionary of Family Names in Britain and Ireland[M]. Great Britain., Oxford: Oxford University Press, 2016.
[47] Aidan Nichols. GK Chesterton, Theologian[M]. Sophia Institute Press, 2009.
[48] Oxford-English Dictionary. "call, n.". [J].
[49] Miles Schmitt. THE ESSAY STYLE OF CHESTERTON[J]. Franciscan Studies, 1943, 3(1): 73-83.
[50] Hugh Kenner. Paradox in Chesterton[M]. New York: Sheed & Ward, 1947.
公开日期:

 2019-06-25    

2019-05-24

英文汉学著作的汉译: 回译和变译.房一品

链接

题名:

 英文汉学著作的汉译: 回译和变译    

姓名:

 房一品    

学号:

 1701212749    

论文语种:

 chi    

专业:

 专业学 - 翻译硕士 - 英语笔译    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 翻译硕士专业学位    

培养单位:

 北京大学    

院系:

 外国语学院    

导师1姓名:

 朱源    

导师1单位:

  外国语学院    

论文答辩日期:

 2019-05-24    

外文题名:

 English to Chinese Translation of Sinology Publications: Back-translation and Translation Variation    

关键词:

 汉学 早期中国哲学 回译 变译    

外文关键词:

 Chinese Studies early Chinese philosophy back-translation translation variation    

论文摘要:

本翻译项目源文本取自《早期中国哲学中的情感元素》一书的部分章节。该书是多伦多大学文理学院东亚研究系副教授居里·维拉格(Curie Virág)所著,于2017年由牛津大学出版社在美国首次出版。该书围绕“情感”在早期中国思想家的理论中的地位展开研究,追溯了早期中国哲学概念的谱系, 并考察了它们在古代中国伦理、政治和文化价值观形成中的关键作用。该书分为六个章节,本翻译项目选取了其中的前言、结论和前三章进行翻译,涉及内容包括:孔子《论语》中的情感元素和完整自我、《墨子》对人类社会的重新定义、《道德经》中宇宙欲望和人的能动性。居里·维拉格在哈佛大学东亚语言与文化系取得了博士学位;她的主要研究方向是前现代时期(战国至公元十二世纪)的中国哲学及思想史,已经出版三部学术著作并发表了三十多篇学术论文。作为一部以英语撰写的学术著作,本翻译项目选取的文本具备下列特点:语言风格正式、专业词汇多、名词化场景多、被动句和复合句多。此外,作为一部海外汉学著作,此书涉及大量用英文改译的汉学典故、文献名称和人名头衔名,给翻译造成了一定困难和挑战。

外文摘要:

This translation project is based on The Emotions in Early Chinese Philosophy. Written by Curie Virág and published by Oxford University Press, New York, in 2017, the book focuses on the significance of emotions in the theories of early Chinese philosophers, traces the genealogy of these early Chinese philosophical conceptions, and examines their crucial role in the formation of ethical, political and cultural values in China. The book consists of six chapters; the introduction, the first three chapters, and the conclusion are taken as the source text of this project. The selected text gives deep insights into emotions and the integrated self in the Analects of Confucius, redefinitions of the human community in Mozi, and cosmic desire and human agency in the Daodejing. The author, Curie Virág, received her Ph.D. from the Department of East Asian Languages and Civilizations at Harvard University. She works in the fields of premodern Chinese philosophy and intellectual history (Warring States to the 12th century) and has published three academic books and more than thirty papers. As an academic work written in English, the text selected for this translation project has the following characteristics: a formal language style and an abundance of philosophical terms, nominalizations, passive sentences, and compound sentences. In addition, as an overseas work on Chinese Studies, the source text involves a large number of Chinese allusions, titles of references, and names of sinologists, which pose many difficulties and challenges for translation.

分类号:

 H31    

论文总页数:

 14    

参考文献总数:

 17    

参考文献列表:
黄忠廉:《变译理论:一种全新的翻译理论》,载于《国外外语教学》,2002年第1期。
弘学:《禅林宝训》讲释,成都:巴蜀书社,2006。
季进,邓楚,许路:《众声喧哗的中国文学海外传播——季进教授访谈录》,载于《国际汉学》,2016年第2期。
焦鹏帅:《变译研究二十年:哲思、发展和国际化》,载于《外语与翻译》,2018年第2期。
刘家润:《晦涩词句中的科学观——关于“老子”第一章的解读》,国学网,2006年12月21日。
南怀瑾:《老子他说》续集。北京:东方出版社,2010。
孙彬:《中国传统哲学概念“理”与西周哲学译名之研究》,载于《哲学与文化研究》,2015年第2期。
谭载喜 主译:《翻译研究辞典》,Mark Shuttleworth, Moira Cowie著。北京:外语教学与研究出版社,2005。
王宏印:《从“异语写作”到“无本回译”——关于创作与翻译的理论思考》,载于《上海翻译》,2016年第3期。
王楠:《对汉学论著翻译规范的探讨》,载于《史学月刊》,2002年第4期。
吴万伟:《英汉学术翻译中的回译问题》,载于中国英汉语比较研究会《中国英汉语比较研究会第十次全国学术研讨会暨2012英汉语比较与翻译研究国际学术研讨会会议日程和摘要汇编》,2002。
许峰:《海外中国学研究的发展前瞻——北京联合大学海外中国学研究中心成立大会暨学术研讨会述要》,载于《中共党史研究》,2012年第11期。
叶红卫:《海外英文汉学论著翻译研究》,载于《上海翻译》,2016年第4期。
赵旭东 译:《帝国的隐喻:中国民间宗教》,Stephan Feuchtwang著。南京:江苏人民出版社,2009。

Virág, Curie. The Emotions in Early Chinese Philosophy. Oxford University Press, 2017.
Craig, Edward, ed. Routledge Encyclopedia of Philosophy: Questions to Sociobiology. Vol. 8. Taylor & Francis, 1998.
Heim, Michael Henry, and Andrzej W. Tymowski. Guideline for the Translation of Social Science Texts. American Council of Learned Societies, 2006.
公开日期:

 2019-06-13    

《译者的取与舍——简析英译汉的异化归化策略》.江皓如

链接

题名:

 《译者的取与舍——简析英译汉的异化归化策略》    

姓名:

 江皓如    

学号:

 1701212752    

论文语种:

 chi    

专业:

 专业学 - 翻译硕士 - 英语笔译    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 翻译硕士专业学位    

培养单位:

 北京大学    

院系:

 外国语学院    

导师1姓名:

 朱源    

导师1单位:

 中国人民大学外国语学院    

论文答辩日期:

 2019-05-24    

关键词:

 历史类文本 异化归化策略 词语 句式 修辞 思维逻辑    

论文摘要:

     《欧洲海外殖民帝国,1879–1999——一段短暂的历史》是一本历史题材类著作。作者探讨了 19 世纪末至 20 世纪末这一百年间欧洲海外殖民帝国的发展动力和历史轨迹,以及这段交织着欲望与血泪的殖民史对当今世界的种种影响。出于对世界历史的热爱和对历史的反思,笔者选择本书作为翻译实践的对象。在本篇报告中,笔者按照译前准备、译中处理和译后处理的顺序,先是简要回顾了国内外对于历史类文本英译汉的研究情况,再对作者选取的异化归化理论进行大致的介绍,并结合翻译实例,从词语、句式、修辞和思维逻辑四个方面分析得出结论——翻译实践中异化与归化并存,缺一不可,从而回答了译者对原文和译文如何取舍的问题。最后,笔者探讨了这两种翻译策略的研究意义,进一步思考了翻译理论对翻译实践的指导作用以及译者如何提升自身专业素质的问题。笔者希望借此番探讨引起广大翻译爱好者和从业者的共鸣。

分类号:

 H059    

论文总页数:

 277    

参考文献总数:

 10    

参考文献列表:
韩烨:《释意理论观下的历史类读物翻译策略》,载于《明日风尚》,2017年第3期。
胡开宝、谢丽欣:《论主体间性与英汉词典历史文本翻译》,载于《宁夏大学学报(人文社会科学版)》,2005年第6期。
刘蓉:《从英汉民族思维差异看英汉语序》,载于《读与写杂志》,2009年第6卷第5期。
刘婷玉:《浅析历史题材类文本的翻译策略——文本类型理论视角》,载于《海外英语(上)》,2017年第7期。
刘婷玉:《浅析历史题材类文本英语被动语态的翻译策略——从主语和主题是否一致视角》,载于《海外英语(上)》,2017年第7期。
Newmark, P. Approaches to Translation. New York: Prentice Hall International (UK) Ltd, 1988.
Nida, Eugene A. Toward a Science of Translating: With Special Reference to Principles and Procedures Involved in Bible Translating. Boston: Brill, 2003.
Reiss, K. Translation Criticism: The Potentials and Limitations. (Translated by Erroll, F.R.) . Manchester: St Jerome Publishing, 1997/2000. (上海教育出版社,2004)
Spears, Richard A. McGraw-Hill Dictionary of American Idioms and Phrasal Verbs. New York: McGraw-Hill, 2002.
Venuti, Lawrence. The Translator’s Invisibility. Shanghai: Shanghai Foreign Language Education Press, 2009.

公开日期:

 2019-06-25    

2019-05-23

汉语“V-的”结构中的“的”及其锚定功能.叶永青

链接

题名:

 汉语“V-的”结构中的“的”及其锚定功能    

作者:

 叶永青    

学号:

 1601213231    

语种:

 eng    

专业:

 文学 - 外国语言文学 - 外国语言学及应用语言学    

公开时间:

 3年后    

培养层次:

 硕士    

学位:

 文学硕士    

培养单位:

 北京大学    

院系:

 外国语学院    

导师姓名:

 何卫    

导师单位:

  外国语学院    

答辩日期:

 2019-05-23    

题目(外文):

 The Anchoring Function of de in V-de Construction in Mandarin Chinese    

关键字(中文):

 “V-的”结构 时态锚定 生成语法    

关键字(外文):

 Verbal de tense anchoring generative linguistics    

文摘:

大量的文献探究了生成语言学视角下的时态表达,但是前人对具体语言中的时态系统仍然没有明确定论。中文通常被认为缺乏显性的形态变位,因此学术界对其时态的表达机制有众多的讨论。本文旨在研究中文“V-的”结构中“的” 的时态功能和锚定机制。前人文献里讨论了与“V-的”相似的结构,如分裂句、事态句、焦点结构等等。本文试图将中文“V-的”结构与其它类似形式区分开来,并表明“V-的”结构表现出特殊的句法特性。中文“V-的”结构并不应该被视为和前人讨论的“是…的”等句内部结构一致,也不应被笼统归为是同一结构的不同变体。众多的研究观察表明中文“V-的”句有两个主要的句法表现:其一、时态上,中文“V-的”结构倾向于得到过去时的解读,且这种解读是由功能词“的”带来的。其二、中文“V-的”句与表示将来的时态标记,中文体助词“了”、“着”、“过”,以及句末“了”在句法上并不兼容。在此基础上,本研究对结构的讨论需要回答两个与中文“V-的”结构的句法属性相关的研究问题:第一、这个结构中的“的”如何产生表达偏向过去时的、非未来的时态解读?第二、为什么这个结构中的“的” 在句法上不允许与上述提到的元素共现?本研究在生成句法的视野和最简方案的框架下提出了一个解释,将“的” 视为有指示性质的词项,含有[+指示性]的特征,其功能为锚定事态。锚定的功能在中文的Dº和Tº上同样实现为词素“的”。“的”在“V-的”结构的句法生成过程中其位置从AspPº移动到Tº,最终落脚到Cº。这种论证的原因在理论上有Marantz(2013)的语境异义性(contextual allosemy)概念的支持,并在实践层面可以解释上述提到的中文“V-的”结构的众多句法表现。

本文结构上首先简要介绍了中文“V-的”结构的一系列句法表现。文章第一章回顾了以往研究提出的关于时态机制的相关文献。第二章综述探究了在不同理论视角下前人研究对和中文“V-的”结构类似的不同形式的结构的分析,如焦点句、分裂句等。第三章讨论了中文“V-的”句的形式和句法表现,将其和其它的形式区分开来,明确定义了什么是本文讨论“V-的”结构,并进一步展开陈述本文要讨论的问题。文章第四章对词项“的” 在中文“V-的”结构中的句法结构和语义属性进行了解释。本研究旨在考察、描述、分析中文“V-的”结构的时态特性并从句法的层面提出解决方案。本研究的贡献在于帮助未来的研究区分与中文“V-的”结构相似的众多结构,并对后人有关中文“的”、焦点结构、信息结构等研究提供思路。本研究同时也为比较跨语言的时态表达和时态锚定的机制提供了一个视角,为后人讨论汉语时态的系统和时态锚定机制提供了话题。

文摘(外文):

Tense expression has been extensively researched but not adequately attested under the generative linguistic paradigm. Theories of tense derivation in specific languages abound. Mandarin Chinese (henceforth Chinese) is generally considered to lack overt morphological tense inflection, and has thus been the subject of much scholarly debate on various tense-related issues. This paper sets out to investigate the tense interpretation of verbal de in the V-de construction in Chinese. The V-de construction has been given many labels in previous literature, such as cleft construction, state-of-affairs sentence, and focus construction, to name but a few. The present study attempts to distinguish the V-de structure from other analogous forms and suggests that it demonstrates particular syntactic properties, unlike three other structure types previously thought to be variations of the same homogeneous construction as V-de. The current analysis examines two major unresolved puzzles of the V-de structure: a) it has been widely recognized to yield a preferred past reading, and its temporal information is proposed to be realized via the functional item de; and b) it is incompatible with future tense markers, the aspectual auxiliaries le/zhe/guo, and sentential le. Such distinct properties lead to some inquiries about the syntax of V-de and the functions of its constituents. This study intends to answer the following questions: a) how does verbal de yield a non-future tense reading? and b) why does the structure disallow co-occurrence with the above-mentioned elements? The present study proposes an explanation to account for these syntactic properties from a formal perspective, aligning with the spirit of the Minimalist Program (henceforth MP). It regards verbal de as a featured item in the lexicon whose deictic feature can be realized when de is merged in either Dº or Tº. In both cases, its deicticity fulfills a general anchoring function, and its specificity varies with its particular representation on different functional heads. In the analysis of V-de, the temporal reading derived in the construction can be accounted for by the deicticity of de when merged in Tº. The syntactic process in V-de sentences is argued to be that de moves from AspPº to Tº and finally to Cº. This argument is theoretically supported by Marantz's (2013) concept of contextual allosemy and syntactically attested by evidence from Chinese verbal de constructions.

This paper first provides a brief introduction to the syntactic behavior of the verbal de structure. Chapter one reviews relevant literature on tense mechanisms proposed in previous studies, which serves as the groundwork for tense research. Chapter two surveys past studies from different theoretical perspectives, both on verbal de and on variant forms of the V-de construction, whose idiosyncrasy is concealed under various labels such as focus/cleft construction. Chapter three discusses the particular form and characteristics of the V-de structure and what does not count as a V-de structure by examining their syntactic representations and pinning down the exact issues to be addressed in this paper. Chapter four offers an explanation of the item de and the basic syntactic and semantic features of the V-de structure. This paper not only provides a descriptive examination of some puzzling structures but also puts forward a syntactic explanation of the tense properties of the V-de construction, in the hope of shedding light on the issues surrounding V-de as well as on tense anchoring in Chinese. It also opens a window onto further cross-linguistic comparison in the expression of temporal and aspectual information, thus contributing to the large body of literature on the mechanism of the tense system of analytic languages in general and Chinese in particular.

分类号:

 H04    

论文总页数:

 60    

参考文献数:

 91    

参考文献:
Adger, D. 2007. Three domains of finiteness: A minimalist perspective. Finiteness: Theoretical and Empirical Foundations. In I. Nikolaeva (ed.). 23–58. Oxford: Oxford University Press.
Baker, M. & Travis, L. 1997. Mood as verbal definiteness in a “tenseless” language. Natural Language Semantics 5(3): 213–269.
Chao, Y. R. 1968. A Grammar of Spoken Chinese. Berkeley: University of California Press.
Chappell, H. & Thompson, S. A. 1992. The semantics and pragmatics of associative DE in Mandarin Chinese discourse. Cahiers de Linguistique—Asie Orientale 21(2): 199-229.
Cheng, L. L-S. 2008. Deconstructing the shi de construction. The Linguistic Review 25, 3/4: 235–266.
Chiu, B. H. 1993. The Inflectional Structure of Mandarin Chinese. Doctoral dissertation, UCLA.
Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press.
Comrie, B. 1985. Tense. Cambridge: Cambridge University Press.
Deng, S-H. 1979. Remarks on cleft sentences in Chinese. Journal of Chinese linguistics 7 (1): 101-114.
Ehlich, K. 1982. Anaphora and deixis: same, similar, or different? In Jarvella & Klein (eds.): 315-338.
Encyclopedia of Chinese Languages and Linguistics. 2015. In R. Sybesma, W. Behr, Z. Handel, C.-T. J. Huang & J. Myers (eds.). Leiden: Brill.
Gärdenfors, P. & Brala-Vukanović, M. 2018. Semantic domains of demonstratives and articles: A view of deictic referentiality explored on the paradigm of Croatian demonstratives. Lingua 201: 102-118.
Gerner, M. 2009. Deictic features of demonstratives: A typological survey with special reference to the Miao group. The Canadian Journal of Linguistics / La revue canadienne de linguistique, 54(1): 43-90.
Gillon, C. 2009. Deictic features: evidence from Skwxwú7mesh. International Journal of American Linguistics 75(1): 1-27.
Grano, T. 2017. Finiteness contrasts without Tense? A view from Mandarin Chinese. Journal of East Asian Linguistics 26(3): 259–299.
Heine, B. T. K. 2002. World Lexicon of Grammaticalization. Cambridge: Cambridge University Press.
Hinzen, W. & Sheehan, M. 2013. The Philosophy of Universal Grammar. Oxford: Oxford University Press.
Huang, C-T. J. 2015. On syntactic analyticity and parametric theory. Chinese Syntax in a Cross-linguistic Perspective, In Audrey Li, Andrew Simpson & Dylan Tsai (eds.). 1-48. Oxford: Oxford University Press.
Huang, C-T. J., Li, Y -H. A. & Li Y. F. 2008. The Syntax of Chinese. Cambridge: Cambridge University Press.
Klein, W. 1994. Time in Language. London: Routledge.
Klein, W., Li, P. & Hendriks, H. 2000. Aspect and assertion in Chinese. Natural Language and Linguistic Theory 18: 723–770.
Levinson, S. C. 1983. Pragmatics. Cambridge: Cambridge University Press.
Levinson, S. C. 2004. Deixis. The Handbook of Pragmatics. In L. Horn and G. Ward (eds.). 97–121. Oxford: Blackwell.
Lin, J-W. 2000. On the temporal meaning of the verbal–le in Mandarin Chinese. Language and Linguistics 1(2):109-133.
Lin, J-W. 2002. 论现代汉语的时制意义. Language and Linguistics 3(1): 1-25.
Lin, J-W. 2003. Temporal reference in Mandarin Chinese. Journal of East Asian Linguistics 12:259–311.
Lin, J-W. 2006. Time in a language without tense: The case of Chinese. Journal of Semantics 23: 1–56.
Lin, J-W. 2010. A tenseless analysis of Mandarin Chinese revisited: A response to Sybesma 2007. Linguistic Inquiry 41: 305–329.
Lin, J-W. 2012. Tenselessness. The Oxford Handbook of Tense and Aspect. In R. I. Binnick (ed.). 669–695. Oxford, UK: Oxford University Press.
Lin, T-H. J. 2015. Tense in Mandarin Chinese sentences. Syntax, 18 (3): 320-342.
Lyons, J. 1977. Semantics. Cambridge: Cambridge University Press.
Marantz, A. 2013. Verbal argument structure: Events and participants. Lingua, 130:152–168.
Modine, P. 1993. A theory of evolution of the Mandarin focus construction ‘shi…de’. Asian and African Studies (2): 154-168.
Ning, C. Y. 1995. De as a functional head in Chinese. Paper presented at the Annual Forum of the Linguistic Society of Hong Kong.
Paris. M-C. 1979. Nominalization in Mandarin Chinese: The morpheme de and the shi…de construction, DRL, Universite de Paris 7, Paris.
Paul, W. 2005. Low IP area and left periphery in Mandarin Chinese. Recherches Linguistiques de Vincennes 33: 111–133.
Law, P. & Ndayiragije, J. 2017. Syntactic tense from a comparative syntax perspective. Linguistic Inquiry, 48(4): 679-696.
Paul, W. & Whitman, J. 2008. Shi…de focus clefts in Mandarin Chinese. The Linguistic Review 25, 3/4: 413-451.
Pollock, J.-Y. 1989. Verb movement, Universal Grammar, and the structure of IP. Linguistic Inquiry, 20, 365-424.
Pulleyblank, E. 1995. Outline of Classical Chinese Grammar. Vancouver: University of British Columbia Press.
Reichenbach, H. 1947. Elements of Symbolic Logic. New York: The Macmillan Company.
Ritter, E. & Wiltschko, M. 2005. Anchoring events to utterances without tense. In Proceedings of the 24th West Coast Conference on Formal Linguistics. In John Alderete et al. (eds.). 343-351. Somerville, MA: Cascadilla Proceedings Project.
Ritter, E. & Wiltschko, M. 2009. Varieties of INFL: TENSE, LOCATION and PERSON. Alternatives to Cartography. In Jeroen van Craenenbroeck (ed.), 153–201. Berlin: Mouton de Gruyter.
Ritter, E. & Wiltschko, M. 2014. The composition of INFL: An exploration of tense, tenseless languages, and tenseless constructions. Natural Language & Linguistic Theory 32(4): 1331–1386.
Roberts, I. 1993. Verbs and Diachronic Syntax: a Comparative History of English and French. Dordrecht: Kluwer Academic Publishers.
Simpson, A. Definiteness agreement and the Chinese DP. Language and Linguistics 2: 125–156.
Simpson, A. 2002. On the status of ‘modifying’ DE and the structure of the Chinese DP. On the Formal Way to Chinese Languages. In S-W Tang & C-S Liu (eds.). 260-285. Stanford: CSLI Publications.
Simpson, A.& Wu, Zoe X-Z. 2002. From D to T — determiner incorporation and the creation of tense. Journal of East Asian Linguistics 11: 169 - 209.
Smith, C. S. & Erbaugh, M. S. 2005. Temporal interpretation in Mandarin Chinese. Linguistics 43 (4): 713–756.
Soh, H. L., & Gao, M. 2008. Mandarin sentential -le, perfect and English already. Event Structure in Linguistic Form and Interpretation. In J. Dölling, T. Heyde-Zybatow, & M. Schäfer (eds.). 447-473. Berlin: Mouton de Gruyter.
Stowell, T. A. 1982. The tense of infinitives. Linguistic Inquiry 13:561-70.
Stowell, T. A. 1995. The phrase structure of tense. Phrase Structure and the Lexicon. In J. Rooryck & L. Zaring (eds.). 277-291. Dordrecht: Kluwer Academic Publishers.
Sybesma, R. 2007. Whether we tense-agree overtly or not. Linguistic Inquiry 38: 580–587.
Tang, T-C. 1983. Guoyu de jiaodian jiegou: fenlieju, fenlie bianju yu zhun fenlieju [Focusing constructions in Chinese: cleft sentences and pseudo-cleft sentences]. Universe and Scope. Presupposition and Quantification in Chinese. In T-C Tang, R. L. Cheng, & Y-C Li (eds.). 127 - 226. Taipei: Student book Co.
Teng, H-H. 1979. Remarks on cleft sentences in Chinese. Journal of Chinese Linguistics 7:101–113.
Tsai, W-T. D. 2008. Tense anchoring in Chinese. Lingua 118 : 675–686.
Warglien, M. & Gärdenfors, P. 2013. Semantics, conceptual spaces, and the meeting of minds. Synthese, 190: 2165-2193.
Wiltschko, M. 2003. On the interpretability of tense on D and its consequences for Case Theory. Lingua 113: 659-696.
Wiltschko, M. 2004. Expletive categorical features: A case study of number in Halkomelem. In Proceedings of NELS 35 (2). In Leah Bateman & Cherlon Ussery (eds.). 631–646. Amherst, MA: GLSA Publications.
Wiltschko, M. 2014. The Universal Structure of Categories: Towards a Formal Typology. Cambridge University Press.
Wu, J-S. 2009. Tense as a discourse feature: Rethinking temporal location in Mandarin Chinese. Journal of East Asian Linguistics 18: 145–165.
Xu, Y. 2014. A corpus-based functional study of shi...de constructions. Chinese Language and Discourse 5(2): 146–184.
邓思颖. 2006. 以“的”为中心语的一些问题. 当代语言学(3).
李讷, 安珊笛, 张伯江. 1998. 从话语角度论证语气词“的”. 中国语文(2).
李铁根. 2002. “了”、“着”、“过”与汉语时制的表达. 语言研究(3).
林若望. 2017. 再论词尾“了”的时体意义. 中国语文(1).
刘勋宁. 1985. 现代汉语词尾“了”的语法意义. 中国语文(5).
刘勋宁. 1990. 现代汉语句尾“了”的语法意义及其与词尾“了”的联系. 世界汉语教学(2).
吕叔湘主编. 1980. 现代汉语八百词. 商务印书馆.
郭锐. 2015. 汉语谓词性成分的时间参照及其句法后果.世界汉语教学(4).
郭锐. 2016. 汉语叙述方式的改变和“了1”结句现象. 中国语文 (263).
黄正德. 1990. 說「是」和「有」.中央研究院歷史語言研究所集刊 (59).
马学良&史有为. 1982. 说“上哪儿的”及其“的”. 语言研究(1).
麦子茵. 2012. 终结性与“(是)…的”的焦点结构. 语言学论丛(44).
木村英树. 2003. “的”字句的句式语义及“的”字的功能拓展. 中国语文(4).
杉村博文. 1999. “的”字结构、承指与分类. 汉语现状与历史的研究(江蓝生、侯精一主编).中国社会科学出版社.
石毓智. 2000. 论“的”的语法功能的同一性. 世界汉语教学 (1).
石毓智. 2005. 论判断、焦点、强调与对比之关系—“是”的语法功能和使用条件. 语言研究 25 (4).
石定栩. 2008. “的”和“的”字结构. 当代语言学(4).
宋玉柱. 1981. 关于时间助词“的”和“来着”. 中国语文(4).
史有为. 1984. 表已然义的“的b”补议. 语言研究(1).
完权 2018. “的”和“的”字结构. 上海:学林出版社.
完权. 2013. 事态中的“的”. 中国语文(1).
王文颖. 2016. 现代汉语“是……的”句的焦点结构研究. 博士论文: 北京大学中国语言文学系.
袁毓林. 1995. 谓词隐含及其句法后果—“的”字结构的代称规则和“的”的语法、语义功能. 中国语文(4).
袁毓林. 2003a. 从焦点理论看句尾“的”的句法语义功能. 中国语文(1).
袁毓林. 2003b. 句子的焦点结构及其对语义解释的影响. 当代语言学(4).
朱德熙. 1961. 说“的”. 中国语文(12).
朱德熙. 1978. “的”字结构和判断句. 中国语文(1-2).
朱德熙. 1982. 语法讲义. 北京: 商务印书馆.
朱庆祥. 2017. 也论“应该∅的”句式违实性及其相关问题.手稿.
公开日期:

 2022-06-04    

2019-05-20

供应链金融下中小企业信用评级研究 -以工程机械行业为例.孙浩

链接

题名:

 供应链金融下中小企业信用评级研究 -以工程机械行业为例    

姓名:

 孙浩    

学号:

 1701211051    

论文语种:

 chi    

专业:

 专业学 - 工程管理硕士 - 工程管理硕士    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程管理硕士    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 张宏岩    

导师1单位:

 软件与微电子学院    

论文答辩日期:

 2019-05-20    

关键词:

 供应链金融 信用评级 因子分析 Logistic回归模型    

论文摘要:

中小企业在优化经济结构和缓解就业压力等方面呈现出重要的价值,但是受到生产经营规模较小、管理模式落后等因素的制约,中小企业的融资渠道极为狭窄,融资成功率也较低,极大地限制了中小企业进一步发展壮大的步伐。与此同时,国内供应链金融随之应运而生,商业银行等金融机构帮助中小企业周转流动资金,实现多方互利共赢。然而,供应链金融存在信息不对称风险,不同的供应链金融模式所潜在的风险也具有显著差异。随着我国供应链金融行业呈现出迅猛的发展态势,商业银行在经营过程中开始面临在供应链的特殊环境下对中小企业的信用进行风险评估的问题。

本文以供应链金融的发展状况作为宏观研究背景,通过对工程机械行业供应链金融融资模式及相应模式下的风险特征的研究,筛选并优化工程机械行业供应链金融信用评价指标,量化工程机械行业供应链金融环境下中小企业潜在的信用风险。

首先,本文阐述了研究命题所涉及的相关理论内容,即供应链金融概念、融资模式类型以及相关信用评价体系等;其次,详细阐述了当前工程机械行业供应链金融下不同的融资模式的具体流程及各自的风险特征,从而为构建工程机械行业基于供应链金融环境下的信用指标体系形成良好的前提条件;最后,选取了财务数据完善的工程机械行业中新三板企业作为样本,运用因子分析法对初选的信用指标体系进行降维处理,并利用Logistic回归模型来完成基于供应链金融环境下工程机械行业中小企业信用风险评价体系的构建。

本研究构建了工程机械行业基于供应链金融的信用评价指标体系,并检验了指标体系的可行性。本指标体系的设计和实现对工程机械行业中小融资企业具有理论价值和现实意义。
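
The final modeling step described in the abstract above (a Logistic regression fitted on factor scores) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two factor names, the synthetic data, and the plain gradient-descent fit are not taken from the thesis, whose actual indicators and estimation procedure are not given in this record.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def default_probability(w, x):
    """Predicted probability of default for one firm's factor scores."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def make_synthetic(n=200, seed=0):
    """Hypothetical data: two factor scores per firm, label 1 = default."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        f1 = rng.gauss(0, 1)  # hypothetical "solvency" factor score
        f2 = rng.gauss(0, 1)  # hypothetical "supply-chain relationship" factor score
        p = sigmoid(-1.5 * f1 - 1.0 * f2)  # higher scores imply lower default risk
        X.append([f1, f2])
        y.append(1 if rng.random() < p else 0)
    return X, y


X, y = make_synthetic()
weights = fit_logistic(X, y)
p_strong = default_probability(weights, [1.5, 1.5])    # firm with strong factor scores
p_weak = default_probability(weights, [-1.5, -1.5])    # firm with weak factor scores
```

In this sketch a firm with weak factor scores receives a higher predicted default probability than a firm with strong ones, which is the kind of ranking a credit-rating model of this type is meant to produce. In practice the factor scores would come from a factor analysis of the financial indicators rather than from a random generator.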

分类号:

 F83    

论文总页数:

 49    

参考文献总数:

 56    

参考文献列表:
[1] 李芹,吴丝丝,霍强.中小企业融资困境与供应链金融创新研究[J].经济论坛,2014(05):61-67.
[2] 宋华.供应链金融[M].二版.北京:中国人民大学出版社,2016:8-13.
[3] 丁汀,李雪梅.供应链金融解决中小企业融资的优势分析[J].物流技术,2009(07):73-75.
[4] 李金龙.2011.供应链金融理论与实务[M].北京:人民交通出版社, 5-6.
[5] 弯红地.供应链金融的风险模型分析研究[J].经济问题,2008(11). 
[6] B. A. Ahn, S. S. Cho and C.Y Kim. The integrated methodology of rough set Theory and artificial neural network for business failure prediction. Expert Systems with Applications 2008,18(2):65-74.
[7] Dr Clarence N. W. Tan, Bond University, Gold Coast,Qld. A Study on Using Artificial Neural Networks to Develop an Early Warning Predictor for Credit Union Financial Distress with Comparison to the Probit Model[J].Managerial Finance,2011,27(4):56-77.
[8] Dadios Kumarasamy, Prakasb Singh. Access to Finance, Financial Development Countries and Firm Ability to Export: Experience from Asia-Pacific countries[J].Asian Economic Journal,2412,32(1).
[9] Guilherme Barreto Fernandes. Application of metabolic GM (1,1) model in financial repression approach to the financing difficulty of the small and medium-sized enterprises[J].Grey Systems:Theory and Application,2016,4 (2).
[10] Maldonado S, Bravo C, Lopez J, et al. Integrated framework for profit-based feature selection and SVM classification in credit scoring[J]. Decision Support Systems, 2017, (04):113-121.
[11] 曾筝.商业银行信用风险评估方法研究[J].计算机仿真,2011,28(08):372-375.
[12] 运迪,周建辉.基于改进Z值模型的企业信用风险评估与检验[J].统计与决策,2014(10):173-176.
[13] 曾玲玲,潘霄,叶曼.基于BP-KMV模型的非上市公司信用风险度量[J].财会月刊,2017(18):47-55.
[14] 奚梦缘.中小企业信用指标体系构建及评估模型的最优化[J].经济问题,2018(10).
[15] Shashank Pao, Thomas J. Goldsby. Supply chain risks: a review and typology [J]. The international journal of logistics and management,2009,20(1):97-123.
[16] Demica. Supply chain finance: a third report from Demica[R]. London, UK,2009.
[17] Sunil Chopra, Peter Meindl. Supply chain management: strategy, planning and operation [M]. London, UK: Pearson Press,2009.
[18] Chih-Yang Tsai,On delineating supply chain cash flow under collection risk[J]. International Journal of Production Economics,2010(1):186-194.
[19] Bob Dyckman. Integrating supply chain finance into the payables process[J]. International Journal of Production Economics,2011(3):172-180.
[20] Abhijeet Ghadge, Samir Dani, Michael Chester,Roy Kalawsky. A systems approach for modeling supply chain risks [J]. Supply chain management:an international journal,2013,18(5):523-538.
[21] 张浩.基于供应链金融的中小企业信用评级模型研究[J].东南大学学报(哲学社会科学版),2008(2).
[22] 熊熊,马佳,赵文杰.供应链金融模式下的信用风险评价[J]. 南开管理评论,2009(4).
[23] 胡海青,张琅,张道宏.供应链金融视角下的中小企业信用风险评估研究——基于SVM与BP神经网络的比较研究[J].管理评论,2012(11).
[24] 夏泰凤,王红梅; 中小企业供应链融资模式的风险管理[J].经济导刊,2012(1).
[25] 郭战琴.基于供应链金融的小微企业融资模式——以第三方龙头物流企业为平台[J].金融理论与实践,2012(1):76-83.
[26] 陈长彬,盛鑫.供应链金融中信用风险的评价体系构建研究[J].福建师范大学学报(哲学社会科学版) ,2013(2).
[27] 黄静思,宋河,宋新红.供应链金融贷款风险识别与评价方法研究[J].金融理论与实践,2014(2):46-49.
[28] 胡慧慧,傅为忠.基于改进灰色关联度方法的互联网供应链金融风险评价[J].武汉金融.2016 (3) :51-55.
[29] 高翔,贾亮亭.基于结构方程模型的企业跨境电子商务供应链风险研究——以上海、广州、青岛等地167家跨境电商企业为例[J].上海经济研究,2016(05):76-83.
[30] Angapp Gunasekaran,Kee -hung Lai,T.C. Edwin Cheng.Responsive supply chain:a competitive strategy in a networked economy[J]. The international journal of management science,2008,36:549-564.
[31] Bernabucci R.J. Supply chain gains from integration[J]. Financial Executive,2008,24(3):46-48.
[32] Bing Jing,Abraham Seidmann. Financing sourcing in a supply chain [J]. Decision support systems,2014,58(2):15-20.
[33] 赵亚娟,杨喜孙,刘心报.供应链金融与中小企业信贷能力的提升[J].金融理论与实践,2009(10).
[34] Bob Dyckman. Supply chain finance:risk mitigation and revenue growth [J]. Journal of corporate treasury management,2011,4(2):168-173.
[35] Camerinelli D. Supply chain finance[J]. Journal of Payments Strategy & Systems,2009,3(2):114-128.
[36] Cossin D, Hricko T. A structural analysis of credit risk with risky collateral: A methodology for haircut determination [J]. Economic Notes,2003, 32(2):243-282.
[37] 贾俊平,何晓群,金勇进.“十二五”普通高等教育本科国家级规划教材,21世纪统计学系列教材[M].中国人民大学出版社,2012,(05):33-57.
[38] 杨丹清.供应链金融背景下中小企业融资模式探究[J].合作经济与科技,2016(03):50-51.
[39] Chih-Yang Tsai. On delineating supply chain cash flow under collection risk [J]. International journal of production economics,2011,129(1):186-194.
[40] David A. Wuttke, Constantin Blome, Michael Henke. Focusing the financial flow of supply chains: an empirical investigation of financial supply chain management [J]. International journal of production economics,2013,145(2):773-789.
[41] Epley R. Donald, Liano Kartono,Haney Richard. Borrower risk signaling using loan-to-value ratios[J]. Journal of Real Estate Research,1996,11(1):71-86.
[42] F. Mathis, J. Cavinato. Financing the Global Supply Chain: Growing Need for Management Action [J]. Thunderbird International Business,2010,52(6):467-474.
[43] 张文春.供应链金融视角下中小企业融资路径分析[J].商业时代,2010(26):85-116.
[44] Hans -Christian Pfohl, Moritz Gomm. Supply chain finance: optimizing financial flows in supply chains [J]. Logist Research,2009(1):149-161.
[45] M. Theodore, Paul D. Hutchison. Cash-to-cash: the new supply chain management metric[J]. International journal of physical distribution & logistics management,2002,32 (4):288-298.
[46] Miao He, Changrui Ren, Qinhua Wang, Jin Dong. Chapter 3:supply chain finance:concept and modeling [C]// Feiyue Wang. Service science management and engineering. Hangzhou: Zhejiang University Press,2012:37-58.
[47] Mingsheng Yang. Research on supply chain finance pricing problem under random demand and permissible delay in payment[J]. Procedia computer science,2013(17):245-257.
[48] P.L. Abad,C. K. Jaggi. A joint approach for setting unit price and the length of the credit period for a seller when end demand is price sensitive [J]. International journal of production economics, 2003(83):115-122.
[49] Peter Finch. Supply chain risk management [J]. Supply chain management:an international journal,2004,9(2):183-196.
[50] Rhian Slivestro, Paola Lustrato. Integrating financial and physical supply chain:the role of banks in enabling supply chain integration [J]. International journal of operations & production management,2014,34(3):298-324.
[51] Tseng, M.L., Chiang, J.H., Lan, W.L. Selection of optimal supplier in supply chain management strategy with analytic network process and choquet integral. Comput[J]. Ind. Eng. 2009,57 (1): 330-340.
[52] Wesley S. Randall, M. Theodore Farris. Supply chain financing:using cash -to -cash variables to strengthen the supply chain [J]. International journal of physical distribution & logistics management,2009,39(8): 669-689.
[53] Shang, K.H., Song, J.S., Zipkin, P.H. Coordination mechanisms in decentralized serial inventory systems with batch ordering. Manag. Sci. 2009, 55 (4):685-695.
[54] Vickery, Jayaram, Droge & Calantone. The effect of an integrative supply chain strategy on customer service and financial performance: an analysis of direct versus indirect relationships [J]. Journal of operations management,2003,21(5):523-539.
[55] Wuttke, D.A., Blome, C., Heese, H.S., Protopappa-Sieke, M. Supply chain finance: optimal introduction and adoption decisions. Int. J. Prod. Econ. 2016,178: 72-81.
[56] Xiangjun He, Lingyun Tang. Exploration on building of visualization platform to innovate business operation pattern of supply chain finance [J]. Physics procedia,2012(33):86-93.
公开日期:

 2019-06-03    

国际视角下建筑行业协会合作对建筑职业培训效果影响的研究.田志伟

链接

题名:

 国际视角下建筑行业协会合作对建筑职业培训效果影响的研究    

姓名:

 田志伟    

学号:

 1701211055    

论文语种:

 chi    

专业:

 专业学 - 工程管理硕士 - 工程管理硕士    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程管理硕士    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 张宏岩    

导师1单位:

 软件与微电子学院    

论文答辩日期:

 2019-05-20    

外文题名:

 Research on effect of construction industry vocational training from the perspective of international cooperation among NGOs    

关键词:

 协会国际合作 建筑职业培训 博弈    

外文关键词:

 International cooperation between industry associations Vocational training Game theory    

论文摘要:

我国建筑业科技水平相对较低,从业者安全意识和专业知识相较于发达国家有所不足,导致建筑安全事故较多,给工人生命安全和经济发展带来了危害,加之国内建筑职业培训成效不足,使得目前建筑工人技能水平和职业素质达不到行业发展需要。有效的培训能够实现建筑工人专业技能和职业素养的提高。海外职业培训经验表明,良好的职业培训效果是工人、政府、企业、行业协会多方参与、良性互动、有机融合的结果。
本文首先分析了当前国内外建筑安全形势和职业培训的特点,通过对建筑企业、农民工和国外协会负责人的调研访谈,发现了我国建筑业农民工职业培训的现状及问题,对国内建筑培训成效不足的原因进行深入分析。针对存在的问题,借鉴了国际先进职业培训项目的经验,并着重分析研究了中外建筑行业协会合作开展培训项目的巨大价值和前景。作者通过全面分析英美职业培训的特点,论述协会开展国际职业培训合作对国内建筑业发展的积极意义。文章用博弈论对建筑职业培训中工人、建筑企业、协会和政府之间的博弈关系进行了分析。职业培训主要涉及政府企业间博弈、农民工和企业的博弈以及培训项目提供和参与者之间的博弈。通过加入行业协会的角色,利用协会在国际交流、信息收集、专业知识等方面的独特优势,讨论了中外协会国际合作背景下博弈结果优化的可能性,从而实现吸收英美国家职业培训经验,提高职业培训效果的目的。通过作者对国内建筑企业,建筑工人和中外行业协会的调研数据,对参与中外职业培训合作项目的中国建筑企业和农民工的收益成本进行了实证分析,并提出了改善职业培训效果,促进中国建筑业发展,提升工程安全水平的建议。
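
The game-theoretic argument in the abstract above, that adding an industry association can open up a better outcome for enterprises and workers, can be sketched as a small two-player game. The payoff numbers, the action names, and the idea that the association simply lowers both sides' training costs are hypothetical assumptions for illustration only; the thesis's actual model is not given in this record.

```python
from itertools import product


def build_game(ent_cost, wrk_cost, ent_gain=4, wrk_gain=3):
    """Hypothetical payoffs: (enterprise payoff, worker payoff) per action pair.

    Training only pays off when the enterprise offers it AND the worker joins.
    """
    return {
        ("train", "join"): (ent_gain - ent_cost, wrk_gain - wrk_cost),
        ("train", "decline"): (-ent_cost, 0),
        ("skip", "join"): (0, -wrk_cost),
        ("skip", "decline"): (0, 0),
    }


def pure_nash(payoffs):
    """Return all pure-strategy profiles where no player gains by deviating alone."""
    ent_actions = sorted({a for a, _ in payoffs})
    wrk_actions = sorted({b for _, b in payoffs})
    equilibria = []
    for a, b in product(ent_actions, wrk_actions):
        u_e, u_w = payoffs[(a, b)]
        best_e = all(u_e >= payoffs[(a2, b)][0] for a2 in ent_actions)
        best_w = all(u_w >= payoffs[(a, b2)][1] for b2 in wrk_actions)
        if best_e and best_w:
            equilibria.append((a, b))
    return equilibria


# Without an association: training costs exceed the gains for both sides.
baseline = pure_nash(build_game(ent_cost=5, wrk_cost=4))

# With an association (assumed to cut costs via shared courses and information).
with_association = pure_nash(build_game(ent_cost=2, wrk_cost=1))
```

Under these assumed numbers the only baseline equilibrium is that nobody trains, while the lower costs make mutual participation an equilibrium as well. The no-training outcome remains an equilibrium too, so the sketch shows only that a better outcome becomes possible, not that it is guaranteed.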
 

外文摘要:

Backward construction technology, weak safety awareness, and a lack of professional skills among migrant workers lead to frequent construction accidents in China, which pose a severe threat to lives, families, and the economy in general. The absence of sufficient training programs for migrant workers makes this situation even worse. Because effective training programs are lacking, migrant workers do not possess the necessary safety skills and are therefore unable to meet the requirements of overseas construction projects. Vocational training aims to improve the skills and knowledge of migrant workers. Studies of foreign vocational training conclude that effective training results from the involvement of workers, organizations, enterprises, and the government, with positive interaction and close integration among them.
The paper first introduces trends in global construction safety and the traits of vocational training. Through questionnaires for enterprises, migrant workers, and industry associations, the author has developed a comprehensive understanding of the current construction industry training system and identified its existing problems. To address the ineffectiveness of domestic training programs, the factors that lead to these problems are examined through the experience of foreign vocational training and a literature review. International cooperation among associations from different countries can make training programs meaningful and generate huge benefits for the whole industry. The author derives the reasons through a thorough analysis of training experience in the United States and the United Kingdom. Game theory is used to analyze the stakeholders of the construction industry; the main players in the model are migrant workers, construction enterprises, industry associations, and the government. The paper shows that adding the industry association to the game model of vocational training can improve the outcome of the game compared with the previous model without association participation, owing to the association's unique advantages in international cooperation, information gathering, and professional knowledge. The study aims to improve the effectiveness of vocational training, using collected data to empirically analyze the migrant workers and enterprises involved in vocational training programs between China and the US and UK.
Finally, the paper gives suggestions on how to improve the effectiveness of vocational training, in the hope of further promoting the development of China's construction industry and the safety of engineering.

 

分类号:

 F26    

论文总页数:

 62    

参考文献总数:

 74    

参考文献列表:
陈圆,任宏.美国建筑业劳工培训剖析与启示[J]. 《建筑经济》, 2010 (9) :13-16
程贵妞,韩国明.行业协会参与职业教育的角色分析[J].教育与职业,2008(6):11-14
方东平等.英国和美国建筑安全的现状与发展[J]. 《建筑经济》, 2001 (8) :26-29
国家统计局.中华人民共和国 2018 年国民经济和社会发展统计公报[EB/OL]. http://www.stats.gov.cn/tjsj/zxfb/201902/t20190228_1651265.html
韩永光.建筑业农民工职业教育管理研究[J].中华民居(下旬刊), 2014(9) :239-240
黄浩明.社会组织国际化战略与路径研究[D].天津大学,2014
赖涪林,付春,肖升生.农民工教育培训参与主体的博弈与抉择分析[J]. 《唯实》, 2012 (10) :80-82
李洵.新加坡、英国及香港地区的建筑质量与安全分析[J]. 《土木工程学报》 , 2003 , 36 (9) :38-45
李梦白.美国汽车工程师协会(SAE)教育培训管理及课程体系简介之一——SAE 的职业培训管理[J].质量与可靠性, 2009(2):58-59
李朝.建筑业农民工安全管理研究及应用[D]. 湖南大学,2016
刘璐.英国建筑安全发展概览[J]. 《中国安全生产》 , 2015 (12)
刘志军.建筑业农民工教育培训体系构建及对策研究[D]. 东南大学,2016
刘能文.2016 年全国建筑物资租赁承包行业分析报告[R]北京:中国基建物资租赁承包协会, 2016:1-3
毛亚男.行业协会参与职业教育人才培养模式研究[D]. 天津大学,2013
牛永宁,蔡庸亨,牛新可.英国建筑安全教育培训分析与借鉴[J].《建筑安全》, 2015(11): 7-9
冉云芳.企业参与职业教育办学的成本收益分析[D]. 华东师范大学,2016
申英博.基于博弈理论的建筑安全管理研究[D]. 天津大学,2015
寺田盛纪.日本职业教育——比较与就业过程视角下的职业教育学[M].陈俊英,马丽华,译.北京:人民教育出版社,2014:25.
孙萌.非营利组织的国际化策略与资源的多重依赖——以北京某基金会为例[D]. 2012.
谭璐.中国非学历教育与个人收入关系的实证研究[J].《开放学习研究》, 2018(12): 31-36
王奕俊.企业收益成本视角的校企合作动力机制分析[J].《教育与职业》, 2011 (03) :15-17
魏体丽.澳大利亚行业技能委员会研究[D].华中师范大学,2013
许华榕.闽台行业协会交流与合作深化问题的研究[D].华侨大学,2011
许惠清,黄日强.以行业为主导的职业教育模式[J].河北师范大学学报,2011(9):79-84
徐振.基础设施项目施工企业应对“用工荒”问题的研究[D]. 清华大学,2014
徐卫.新生代农民工职业培训研究[D]. 武汉大学,2016
燕晓飞.非正规就业劳动力教育培训的多主体博弈分析[J].东北师大学报(哲学社会科学版), 2013(2) :144-147
张健.浅析行业协会的功能——基于弥补市场失灵的视角[J].理论界, 2013(6):28-30.
张沁洁.行业协会间的竞合关系演变研究——以广东为例[J]. 华南理工大学学报(社会科学版), 2018,v.20; No.102(02):77-86
郑茜.基于博弈论视角下中国农民工职业培训问题研究[J]. 《知识经济》, 2009 (14) :69-70
中华人民共和国国务院办公厅. 国务院办公厅关于加快推进行业协会商会改革和发展的若干意见. 国办发[2007]36 号[J].工程造价管理, 2007(52):3-5.
中国基建物资租赁承包协会.协会介绍 [EB/OL]. [2015-10]. http://www.ccmrc.org.cn/about.asp?id=369
中国建筑业协会.2017 年建筑业发展统计分析 [EB/OL]. [2018-01]. http://www.zgjzy.org/NewsShow.aspx?id=9146
周丽华.辅助原则与德国“双元制”职业教育中经济组织的主体地位[J].外国教育研究,2015(2):117-128.
朱钰.基于建筑工人认知的安全行为培训研究[D]. 清华大学,2016
赵彬,袁亮,杨希宁.建筑业农民工技能培训障碍与对策研究[J].《建筑经济》 , 2017, 38(12) :100-104
Acemoglu D, Pischke J-S. The structure of wages and investment in general training [J]. Journal of Political Economy, 1999, 107(3): 539-572
ABET.abet accreditation[EB/OL]. [2010-06].https://www.abet.org/accreditation/
Becker G S, Tomes N. Human capital and the rise and fall of families[J]. Journal of Labor Economics,1986, 4(3, Part 2):S1-S39
BEA. 2017 industry stat data [EB/OL]. [2018-06]. https://apps.bea.gov/industry/factsheet/factsheet.cfm
Centre for information on continuing vocational training. A bridge to the future: European policy for vocational education and training 2002-10 -- National policy report - France [DB/OL]. March 2010/2012-06-09. p.14
Centre for information on continuing vocational training. A bridge to the future: European policy for vocational education and training 2002-10 -- National policy report - France [DB/OL]. March 2010/2012-06-09. p.27
CISRS .CISRS handbook [EB/OL]. [2016-10]. http://www.cisrs.org.uk/
Dietrich H,Koch S,Stops M.The apprenticeship places crisis: training needs to be worthwhile,including for companies.Establishment Panel Survey[R].Nuremberg, Brief Report,2004, No. 6.
Edward L. Taylor .Safety benefits of mandatory OSHA 10 h training [J].Safety Science, Volume 77,August 2015, Pages 66-71
Granger C W J. Some Recent Developments in a Concept of Causality [J]. Journal of Econometrics, 1988, 39: 199-211
Harsanyi J C, Selten R.A Generalized Nash Solution for Two-Person Bargaining Games with Incomplete Information [J]. Management Science, 1972, 18(5-part-2):80-106.
Hinze. Analysis of Fatalities Record by OSHA [J]. Journal of Construction Engineering and Management,1995, (6): 23-25.
Hansen, Hal. Caps and Gowns [D]. University of Wisconsin-Madison, 1997.
H.Rauhut. Higher Punishment, Less Control? Experimental evidence on the inspection game .[J]Rationality and Society.2009,21(21):359-392
Hinze J, Harrison C. Safety Programs in Large Construction Firms [J]. Journal of the Construction Division, 2014, 107(3):455-467.
Health and Safety Executive. Construction: Work related injuries and ill health [EB/OL]. [2017-10]. http://www.hse.gov.uk/statistics/industry/construction/construction.pdf
Juan Carlos Rubio-Romero. Analysis of the safety conditions of scaffolding on construction sites [J]. Safety Science, Volume 55, June 2013, Pages 160-164
JIFH. Japan International Food for the Hungry[EB/OL]. [2011-08].https://www.jifh.org/eng/activity/
Lewis W A . Economic Development With Unlimited Supplies Of Labour[J]. Manchester School, 1954,22:139-191.
Lehrack D. Environmental NGOs in China - partners in environmental governance [J]. Discussion Papers Presidential Department, 2006.
Maslow A H. Preface to motivation theory.[J]. Psychosomatic Medicine, 1943, 5(1):85-92.
Mincer J. Schooling, Experience, and Earnings. Human Behavior & Social Institutions No. 2. [M]//Schooling, experience, and earnings. 1974.
Mincer J. Human capital and economic growth. [J] Economics of Education Review, Volume 3, Issue 3,1984, Pages 195-205.
Muehlemann S, Schweri J, Winkelmann R, Wolter S C. A Structural Model of Demand for Apprentices[R].CESifo Working Paper. 2005, No.1417.
Muehlemann S, Schweri J, Winkelmann R, Wolter S C. An empirical analysis of the decision to train apprentices [J]. Lab Rev Lab Econ Ind Relat, 2007, 21(3):419-441
Nash J. Two-Person Cooperative Games [J]. Econometrica, 1953, 21(1):128-140.
OSHA. Introduction of OSHA [EB/OL]. [1985-10]. http://www.osha.gov/
Qualifications and Curriculum Development Agency. UK National Policy report for 2010[DB/OL].2010/2012-09-26. p.64
Ryan P, Gospel H, Lewis P. Educational and Contractual Attributes of the Apprenticeship Program of Large Employers in Britain [J]. Journal of Vocational Education and Training, 2006, 58(3):359-383.
Shaked A, Sutton J. Involuntary Unemployment as a Perfect Equilibrium in a Bargaining Model[J].Econometrica, 1984, 52(6):1351-1364.
Strauss A L, Corbin J M. Grounded theory in practice [M]. Grounded theory in practice. 1997.
Starbird S A. Designing Food Safety and Penalties for Noncompliance Regulations: The Effect of Inspection Policy on Food Processor Behavior [J]. Journal of Agricultural and Resource Economics, 2000, 25 (2) :616-635.
Sou-Sen Leu, Ching-Miao Chang. Bayesian-network-based safety risk assessment for steel construction projects [J]. Accident Analysis & Prevention. 2013(54):122-133.
SAIA. Introduction of SAIA [EB/OL]. [2014-05]. https://www.saiaonline.org/aboutsaia
Sevilay Demirkesen.Construction safety personnel's perceptions of safety training practices[J].International Journal of Project Management, Volume 33, Issue 5, July 2015, Pages 1160-1169
Theodore W. Schultz, investing in people: Schooling in low income countries [J]. Economics of Education Review, Volume 8, Issue 3, 1989, Pages 219-223
Von Neumann J, Morgenstern O. Theory of Games and Economic Behavior [M]. 1953.
Wheeler N. Invited influence: American private associations in the modernization of China, 1985--2005[J].Dissertations & Theses - Gradworks, 2007.
公开日期:

 2019-06-11    

2018-11-30

中国技术写作认证考试设计与实证.阮羽

链接

题名:

 中国技术写作认证考试设计与实证    

姓名:

 阮羽    

学号:

 1401210700    

论文语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 何卫    

导师1单位:

  外国语学院    

导师2姓名:

 高志军    

导师2单位:

  软件与微电子学院    

论文答辩日期:

 2018-11-30    

外文题名:

 The Design and Verification of the Technical Writing Certification for Chinese Technical Writers    

关键词:

 技术写作 能力要求 认证考试    

外文关键词:

 Technical writing Competency requirements Certification examination    

论文摘要:

    随着中国经济水平的提升,许多企业和高校意识到技术写作在产品销售、用户满意度中占据越来越重要的地位,开始重视高校人才培养和企业人才输送,同时急需一套人才选拔的基准帮助企业寻觅人才。

    目前,欧美国家有相对完善的技术写作认证考试,例如美国技术传播协会的CPTC认证考试和德国技术传播协会的TCTrainNet认证考试。但是,把这些认证考试直接平移到中国市场是不恰当的,存在以下几个问题:第一,国外的认证考试内容不能对应中国的技术写作岗位要求及其能力要求;第二,时代和科技的进步对技术写作提出了新的要求,比如内容设计、写作要求和质量控制等。第三,国外的认证考试注重技术写作理论知识的传达,对实践操作的考核几乎没有涉及。

    针对以上问题,笔者提出了根据中国技术写作行业需求设计认证考试的研究,并明确了研究思路和方法。首先,本文主要凭借“工作任务”界定能力构成,解释被试需要掌握的能力。笔者通过企业招聘信息、行业从业人员访谈、技术写作课程和已有技术写作考试,总结得出技术写作从业人员需要掌握分析、设计、写作、质量控制、发布这五块能力。其次,笔者根据前文获得的设计依据,制定了中国技术写作认证考试的大纲,并采用专家评定法对考试大纲进行了交叉验证,验证大纲的有效性。接着,笔者根据技术写作考试内容特点,讨论了各题型的适用性,并提出了各题型的设计方法。然后,笔者根据技术写作特点和已有考试评分标准,讨论了本次研究的评价标准。最后,根据技术写作考试大纲和考试方法,笔者展开了三次实验,第一次实验对象为工作两年以上的技术写作从业人员,第二次实验对象为从事技术写作半年内的技术写作从业人员,第三次实验对象为北京大学计算机辅助翻译2017级的学生,测试结果验证了样卷的可靠性和有效性。

    研究结果表明,本次研究的技术写作认证考试大纲和考试设计方法,具备有效性、可信性和可行性。设计的考试既符合了中国市场的需求,又满足了新时代对人才的新需求。希望本文提出的考试设计能启发和鼓励更多企业和行业关注技术写作行业的发展完善和人才的培养。
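
The abstract above reports that the sample paper was checked for reliability. One common internal-consistency statistic for such a check is Cronbach's alpha; the sketch below computes it on made-up scores. Both the choice of Cronbach's alpha and the score matrix are assumptions for illustration, since this record does not say which reliability measure or data the thesis used.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    scores: one row per examinee, one column per exam item.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across examinees.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of each examinee's total score.
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: 5 examinees x 4 exam sections (for example analysis,
# design, writing, quality control), each marked out of 10.
sample_scores = [
    [8, 7, 9, 8],
    [5, 5, 6, 4],
    [9, 8, 9, 9],
    [3, 4, 2, 3],
    [6, 6, 7, 5],
]
alpha = cronbach_alpha(sample_scores)
```

Values close to 1 indicate that the sections rank examinees consistently. A real analysis would use the actual item scores from the three experiments, and would usually report validity evidence separately.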

外文摘要:

    As China's economy has grown, many enterprises and universities have become aware that technical writing plays an increasingly important role in product sales and customer satisfaction. How to train technical writers and evaluate their output has become a concern.

    At present, there are many certification exams related to technical writing, such as the CPTC certification exam of the Society for Technical Communication and the TCTrainNet certification exam of tekom. However, these exams cannot be applied directly to China. First, the content of foreign certification exams does not always fit Chinese job requirements. Second, progress in technology has posed new challenges for technical writing, such as content design, writing requirements, and quality control. Third, foreign certification exams focus more on theoretical knowledge, while practical assessment is barely involved.

    This paper proposes a technical writing certification exam designed around the needs of China's technical writing industry, and sets out the research approach and methods. First, the paper defines the required competencies through job analysis. Drawing on enterprise recruitment information, interviews with practitioners, technical writing courses, and existing technical writing exams, it identifies five major competencies: analysis, design, writing, quality control, and release. Second, the author drafts the syllabus of the Chinese technical writing certification exam and cross-validates it by expert review. Next, the author discusses the applicability of each question type to the content being tested and proposes a design method for each type. The author then discusses the evaluation criteria, based on the characteristics of technical writing and existing scoring criteria. Finally, the author carries out three experiments. The subjects of the first are technical writers with more than two years of experience; the subjects of the second are technical writers who have been in the field for less than half a year; the subjects of the third are students of the 2017 cohort majoring in Computer-Aided Translation at Peking University. The test results support the reliability and validity of the sample paper.

    The results indicate that the proposed exam syllabus and exam design method are effective, credible, and feasible. The exam meets the needs of the Chinese market as well as the new demands that the current era places on practitioners. The author hopes that this design can encourage more enterprises and industries to pay attention to the development of the technical writing industry and to the cultivation of its talent.

分类号:

 G40    

论文总页数:

 84    

参考文献总数:

 57    

参考文献列表:
陈明庆. 考试研究方法导论[M]. 北京大学出版社, 2009.
陈宇. 职业资格考试概论[M]. 华中师范大学出版社, 2002.
陈宇. 我国职业资格证书制度的回顾与前瞻[J]. 教育与职业, 2004(1):17-19.
戴海琦. 心理测量学[M]. 高等教育出版社, 2015.
郭伟萍. 英国职业资格证书制度的研究[D]. 天津大学, 2005.
黄锐. 标准参照语言测试研究[M]. 厦门大学出版社, 2012.
中国技术传播联盟. 2017中国技术传播发展现状调查报告[DB/OL]. http://www.tc-china.org/2017中国技术传播发展现状调查报告/,2018.
李梅. 技术传播性质课程的设计与实现探索——以同济大学实用英语写作课为例[J]. 上海理工大学学报(社会科学版), 2017, 39(2):101-107.
李金波. 让考试更科学[M]. 武汉大学出版社, 2012.
李清华. 高校英语专业四级测试写作评分标准的设计与效度研究[M]. 科学出版社, 2014.
李双燕. 中国技术传播教育研究浅述[J]. 文化与传播, 2015(6).
柳博. 考试命题制度研究[M]. 高等教育出版社, 2017.
吕忠民. 职业资格制度概论[M]. 中国人事出版社, 2011.
苗菊, 高乾. 构建MTI教育特色课程——技术写作的理念与内容[J]. 中国翻译, 2010(2):35-38.
史庆. 英国的国家职业资格证书制度[J]. 全球教育展望, 1997(6):47-52.
陶百强, 陈效. 我国高考英语考试大纲(说明)的问题与思考[J]. 教育与考试, 2008(4):29-34.
田大洲. 我国职业资格证书制度研究[D]. 首都经济贸易大学, 2004.
徐奇智, 王希华. 技术传播学:美国的发展对我们的启示[C]// 亚太地区媒体与科技和社会发展研讨会. 2006.
杨惠中,C.Weir. 大学英语四、六级考试效度研究[M]. 上海外语教育出版社, 1998.
杨惠中, 朱正才, 方绪军. 中国语言能力等级共同量表研究: 理论, 方法与实证研究[M]. 上海: 上海外语教育出版社, 2012.
杨延. 国家职业资格认证考试的国内外比较研究[J]. 职教论坛, 2006(5s):46-49.
俞敬松, 王惠临, 王聪. 翻译技术认证考试的设计与实证[J]. 中国翻译, 2014(4):73-78.
张凯. 汉语水平考试(HSK)研究[M]. 商务印书馆, 2006.
中华人民共和国国家质量监督检验检疫总局, 中国国家标准化管理委员会. 说明书的编制构成内容和表示方法[DB/OL]. 中国标准出版社,2005.
中兴通讯学院. 科技文档写作实务[M]. 人民邮电出版社, 2013.
周海银. 教学测量与评价[M]. 济南:山东大学出版社,2015:5.
Albers M J, Mazur B. Content and Complexity: Information Design in Technical Communication[M]. L. Erlbaum Associates Inc. 2005.
Azuma M, Coallier F, Garbajosa J. How to apply the Bloom taxonomy to software engineering[C]// Eleventh International Workshop on Software Technology and Engineering Practice. IEEE, 2003:117-122.
Bachman L F. Fundamental considerations in language testing[J]. 1990, 75(4).
Blythe S, Lauer C, Curran P G. Professional and technical communication in a web 2.0 world[J]. Technical Communication Quarterly, 2014, 23(4): 265-287.
Brumberger E, Lauer C. The evolution of technical communication: An analysis of industry job postings[J]. Technical Communication, 2015, 62(4): 224-243.
Carey M, Lanyi M F, Longo D, et al. Developing Quality Technical Information: A Handbook for Writers and Editors[M]. IBM Press, 2014.
Carroll J. Minimalism beyond “The Nurnberg Funnel”.[J]. Computers & Human Interaction, 1998.
Coe M. Human factors for technical communicators[M]. John Wiley & Sons, Inc. 1996.
Donald A. Norman. Emotional Design[J]. Ubiquity, 2004, 2004(45):1-1.
Cunningham D. Core competency skills for technical communicators[C]// Professional Communication Conference, 2008. IPCC 2008. IEEE International. IEEE, 2008:1-6.
Gao, Z., Yu, J., & De Jong, M. (2014). Establishing technical communication as a professional discipline. Tcworld, 2014(08), 10–13.
Glaser, R., & Klaus, D.J. (1962). Proficiency measurement: Assessing human performance. In R.M. Gagné (Ed.),Psychological principles in system development. New York: Holt, Rinehart and Winston.
Hackos J A T. Managing Your Documentation Projects[M]. 1994.
Hackos J A T, Redish J. User and task analysis for interface design[J]. 1998.
Harvey, R. J. (1991). Job analysis. In M. Dunnette & L. Hough (Eds.), Handbook of industrial and organizational psychology (2nd ed., Vol. 2, pp. 71–163). Palo Alto, CA: Consulting Psychologists Press.
Henze B, Miller C, Carradini S. Technical Communication[J]. 2016, BTR-7(3):7-7.
Johnson-Sheehan R. Technical Communication Today[M]// Technical communication today. Longman, 2010:256–260.
Krathwohl D R. A revision of Bloom's Taxonomy: an overview - Benjamin S. Bloom, University of Chicago[J]. Theory Into Practice, 2002(Autumn).
Mark R. Raymond. Job Analysis and the Specification of Content for Licensure and Certification Examinations[J]. Applied Measurement in Education, 2001, 14(4):369-415.
Markel M. Technical Communication: Update 2002[M]. Boston: St. Martin's, 2002.
McDowell E E. Certifying Technical Communicators in the 21st Century[J]. 2001.
Nugent J. Certificate programs in technical writing: Through sophistic eyes[J]. Design discourse: Composing and revising programs in professional and technical writing, 2010: 153-170.
Nugent J. A survey of US certificate programs in technical communication[J]. Programmatic Perspectives, 2013, 5(1): 58-85.
O'Hara F M. A brief history of technical communication[C]//ANNUAL CONFERENCE-SOCIETY FOR TECHNICAL COMMUNICATION. UNKNOWN, 2001, 48: 500-504.
Pruitt J, Adlin T. The persona lifecycle: keeping people in mind throughout product design[M]. Elsevier, 2010.
Rainey K T, Turner R K, Dayton D. Do curricula in technical communication jibe with managerial expectations? A report about core competencies[C]// Ipcc 2005. Proceedings. International Professional Communication Conference. IEEE, 2005:359-368.
Roy K. Turner, Kenneth T. Rainey. Certification in Technical Communication[J]. Technical Communication Quarterly, 2004, 13(2):211-234.
Rubin J, Chisnell D. Handbook of Usability Testing 2nd Edition[J]. Wiley Publishing Inc, 2008.
Sauro J, Lewis J R. Quantifying the User Experience[M]. 2012.
Spencer D. Practical Guide to Information Architecture[J]. 2010.
Thompson I. Competence and critique in technical communication: A qualitative content analysis of journal articles[J]. Journal of Business and Technical Communication, 1996, 10(1): 48-80.
Turner R K, Rainey K T. Certification in technical communication[J]. Technical communication quarterly, 2004, 13(2): 211-234.
公开日期:

 2018-11-30    

医学英语词汇学习系统研究与设计.荣岩

链接

题名:

 医学英语词汇学习系统研究与设计    

姓名:

 荣岩    

学号:

 1501210657    

论文语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 姚亚芝    

导师1单位:

 软件与微电子学院    

导师2姓名:

 俞敬松    

导师2单位:

 软件与微电子学院    

论文答辩日期:

 2018-11-30    

外文题名:

 Research and Design of a Medical English Vocabulary Learning System    

关键词:

 医学英语词汇 词汇学习效率 自适应推荐 词汇记忆网 词汇复现    

外文关键词:

 medical English vocabulary vocabulary learning efficiency adaptive recommendation vocabulary memory networks vocabulary repetition rate    

论文摘要:

      医学英语词汇与普通英语词汇不同,有其独特的构词方式,词素特征明显,词汇间的关联性更强。经调研,目前的医学英语词汇学习资源不能完全满足学习者的学习需求,教学资源以纸质材料为主,局限性较大,存在医学英语词汇学习效率不高、学习内容有限、学习者积极性不高等问题,缺乏有效的医学英语词汇教学体系。

      聚焦到医学英语词汇学习效率的问题,以下三个方面仍未得到有效解决。一、目前医学英语词汇教学忽视了学习者的个体差异,现有教学方式难以根据每位学习者的学习情况进行医学词汇学习与复习动态推荐。二、现有英语词汇学习软件未能充分挖掘医学英语词汇特征,未能把握医学英语词汇教学重点,教学流程不完全适用于医学英语词汇。三、医学英语词汇复现率较低,学习者记忆效果不佳。为解决上述问题,本研究以医学英语词汇教学理论和第二语言习得理论为依据,利用移动互联网优势,设计了一款医学英语词汇学习系统,提出以下三种提高医学英语词汇学习效率的方式。一、建立医学英语词汇自适应推荐模型,通过综合计算医学英语词汇的特征影响因子实现医学英语词汇动态推荐。二、依据医学英语词汇特点,精选教学内容模块,突出医学英语词汇教学重点,优化教学流程,构建医学英语词汇记忆网。三、多维度复现词汇,通过多种词汇复现方法,增加词汇的复现率。

      为了验证本研究设计的医学英语词汇学习系统的教学效果,本研究对北京某学校50名大二非英语专业学生进行了对照教学实验。实验表明,本研究设计的学习方式可有效促进医学英语词汇与词素的习得效果和保持效果,且可以提高学习者的猜词能力。

      本研究设计的医学英语词汇学习系统有效提高了医学英语词汇学习效率,缓解了纸质医学资源的局限性,补足了医学英语词汇课堂教授内容受限的短板,可满足学习者的个性化需求,并注重通过多样化学习方式培养学习者的学习兴趣与积极性,对医学英语词汇移动教学具有一定的参考价值。

外文摘要:

      Medical English vocabulary differs from general English vocabulary: it has distinctive patterns of word formation, clearly marked morphemes, and stronger associations among words. A survey of medical English learners shows that current learning resources cannot fully meet their needs. Teaching resources are mainly paper materials, which have many limitations. Learners face low learning efficiency, limited learning content, and low motivation, and an effective teaching system for medical English vocabulary is lacking.

      With respect to learning efficiency, three problems remain unsolved. First, current medical English vocabulary teaching ignores individual differences among learners, and existing teaching methods cannot dynamically recommend words for learning and review according to each learner's progress. Second, existing English vocabulary learning software does not fully exploit the features of medical English vocabulary or capture its key teaching points, so its teaching process is not fully applicable to medical English vocabulary. Third, the repetition rate of medical English vocabulary is low, which leads to poor retention. To solve these problems, this study draws on medical English vocabulary teaching theory and second language acquisition theory, takes advantage of the mobile internet, and designs a medical English vocabulary learning system that improves learning efficiency in three ways. First, it establishes an adaptive recommendation model that recommends vocabulary dynamically by jointly computing the feature-based influencing factors of each word. Second, based on the features of medical English vocabulary, it selects appropriate teaching modules, highlights the key teaching points, optimizes the teaching process, and builds medical English vocabulary memory networks. Third, it repeats vocabulary along multiple dimensions, using several repetition methods to raise the repetition rate.

      To verify the teaching effect of the system, this study conducted a controlled teaching experiment with 50 non-English-major sophomores at a university in Beijing. The results show that the learning methods designed in this study effectively promote the acquisition and retention of medical English vocabulary and morphemes, and improve learners' ability to guess word meanings.

      The medical English vocabulary learning system effectively improves the efficiency of medical English vocabulary learning, alleviates the limitations of paper-based medical resources, compensates for the limited content that can be taught in class, meets learners' personalized needs, and cultivates learners' interest and initiative through diversified learning methods. The system offers a useful reference for mobile teaching of medical English vocabulary.

分类号:

 H08    

论文总页数:

 82    

参考文献总数:

 66    

参考文献列表:
[1] Wilkins D A. Linguistics in language teaching [M]. London: Edward Arnold. 1972.
[2] 张燕, 吴新炜, 张顺兴. 我国高等医学院校医学英语教学现状调查与分析[J]. 中国高等医学教育, 2006(8): 29-30.
[3] 王连柱. 论高频医学词汇的筛选与医学英语教学[J].中国医学教育技术, 2011, 25(2): 217-220.
[4] 刘萍, 刘座雄. 基于ESP语料库的学术英语词汇学习法的有效性研究[J]. 外语研究, 2018(3): 54-60.
[5] Sinclair S, Renouf A. A lexical syllabus for language learning [M]. // Carter, R. & McCarthy, M. Vocabulary and language teaching. London and New York: Longman, 1988: 142-143.
[6] Richard J C. A psycholinguistic measure of vocabulary selection [J]. Iral, 1969, 8(2):87-102.
[7] O'Gorman E. An investigation of the mental lexicon of second language learners [J]. The Irish yearbook of applied linguistics, 1996, (16):15-31.
[8] 马雁. ESP理论视角下的医学英语课程设置及其教学探索[J]. 外语电化教学, 2009(1): 60-63.
[9] 王国良. ESP还是EGP——普通医学院校大学生对医学英语教学看法的调查研究[J]. 中国医学教育技术, 2014(2): 215-220.
[10] Strevens P. ESP after twenty years: A re-appraisal [A]. In M Tickoo (ed.). ESP: State of the Art [C]. Singapore: SEAMEO Regional Language Centre.1998.
[11] Hutchinson T, Waters A. English for specific purposes: A learning-centered approach [M]. Cambridge: Cambridge University Press, 1998:1-10.
[12] 丁青年. 医学英语与英语医学[J]. 上海中医药杂志, 2002(12): 40-41.
[13] Nation I S P. Learning vocabulary in another language [M]. Cambridge: Cambridge University Press.2001.
[14] Gylys B A, Wedding M E. Medical Terminology: A System Approach [M]. Philadelphia: F. A. Davis. 1983.
[15] 沈姝. 从英语词源角度分析医学英语词汇特点[J]. 医学教育探索, 2007, 6(4):329-330.
[16] Schmitt N, M McCarthy. Vocabulary: description, acquisition and pedagogy [M]. Cambridge: Cambridge University Press. 1997.
[17] 陈琦, 高云. 学术英语中的半技术性词汇[J]. 外语教学, 2010, 31(6): 42-46.
[18] 秦秀白. ESP的性质、范畴和教学原则[J]. 华南理工大学学报(社会科学版), 2003, 5(4): 79-83.
[19] 蔡基刚. ESP与我国大学英语教学发展方向[J]. 外语界, 2004, (2): 22-28.
[20] 杨慧中. EAP在中国:回顾、现状与展望[R]. 中国ESP研究高端论坛. 北京外国语大学. 2010.
[21] 华瑶. 医学英语核心词汇的筛选和教学[J]. 医学教育管理, 2016, 2: 36-38.
[22] 李定均. 医学英语词汇学[M]. 上海: 复旦大学出版社. 2006.
[23] 黄远振. 词的形态理据与词汇习得的相关性[J]. 外语教学与研究, 2001, 33(6): 430-435.
[24] 李媛媛. 注意假说视角下词的形态理据对二语词汇习得的影响研究[D]. 扬州大学. 2017.
[25] Yang M N. Nursing pre-professionals’ medical terminology learning strategies [J]. Asian EFL Journal, 2005, 7(1): 137-154.
[26] Brown C, M E Payne. Five essential steps of processes in vocabulary learning [C]. Paper presented at the TESOL Convention, Baltimore, MD. 1994.
[27] Richards J. The Role of Vocabulary Teaching [J]. TESOL Quarterly. 1976, 10(1): 77-89.
[28] Sokmen A J. Word association results: a window to the lexicon of ESL students [J]. JALT Journal, 1993, 15(2): 135-150.
[29] Wray A. Formulaic language and the lexicon [M]. Cambridge: Cambridge University Press. 2005.
[30] Pitts M, White H, Krashen S. Acquiring second language vocabulary through reading: a replication of the clockwork orange study using second language acquirers [J]. Reading in a Foreign Language, 1989, 5(2), 271-275.
[31] Nist S L, Olejnik S. The role of content and dictionary definitions on varying levels of word knowledge [J]. Reading research quarterly, 1995, 172-193.
[32] Palmberg R. Computer games and foreign-language vocabulary learning [J]. Elt Journal, 1988.42(4): 247-252.
[33] Laufer B. Corpus-based versus lexicographer examples in comprehension and production of new words [M]. // Fontenelle T. Practical Lexicography. Oxford: Oxford University Press. 2008: 71-76.
[34] 赵海威. 基于行为特征和数据分析的外语词汇学习模型研究[D]. 北京大学. 2017.
[35] Nation P, R Waring. Vocabulary size, text coverage and word lists [M]. In N Schmitt, M McCarthy. Vocabulary Description Acquisition Pedagogy. 1997.
[36] West M. A general service list of English words [M]. London: Longman, 1953.
[37] Chung T M, Nation P. Identifying technical vocabulary [J]. System, 2004, 32(2): 251-263.
[38] Chujo K, Utiyama M. Selecting level-specific specialized vocabulary using statistical measures [J]. System, 2006, 34(2): 255-269.
[39] Schmidt R. The role of consciousness in second language learning [J]. Applied linguistics, 1990, 11(2):37-41.
[40] Ellis R. SLA research and language teaching [M]. Oxford: Oxford University Press. 1997.
[41] Swain M. Three functions of output in second language learning [A]. In G Cook and B Seidlhofer (eds.). Principle and practice in applied linguistics [C]. Oxford: Oxford University Press. 1995.
[42] Nation I S P. Teaching and Learning Vocabulary [M]. Boston: Heinle & Heinle Publishers. 1990.
[43] Laufer B. The development of passive and active vocabulary in second language: Same or different? [J]. Applied linguistics, 1998, 19: 255-271.
[44] Atkinson R C, R M Shiffrin. Human memory: A proposed system and its control process [J]. Psychology of learning and motivation, 1968, 2: 89-195.
[45] Craik F I M, R S Lockhart. Levels of processing: A framework for memory research [J]. Journal of verbal learning and verbal behavior, 1972, 11(6): 671-684.
[46] 张庆宗, 吴喜燕. 认知加工层次与外语词汇学习——词汇认知直接学习法[J]. 现代外语, 2002, 25(2):176-186.
[47] Nunan D. Language teaching methodology [M]. London: Prentice Hall International Ltd. 1991.
[48] 张烨, 邢敏, 周大军. 非英语专业本科生英语词汇学习策略的调查[J]. 解放军外国语学院学报, 2003, 26(4): 44-48.
[49] Craik F I M, E Tulving. Depth of processing and the retention of words in episodic memory [J]. Journal of experimental psychology, 1975, 104(3): 268-294.
[50] Laufer B, J Hulstijn. Incidental vocabulary acquisition in a second language: The construct of task-induced involvement [J]. Applied linguistics, 2001, 22(1): 1-26.
[51] Collins A M, M R Quillian. Retrieval time for semantic memory [J]. Journal of verbal learning and verbal behavior, 1969, 8(2): 240-247.
[52] Smith E E, E J Shoben, L J Rips. Structure and process in semantic memory: A featural model for semantic decisions [J]. Psychological review, 1974, 81(3): 214-241.
[53] Collins A M, E F Loftus. A spreading-activation theory of semantic processing [J]. Psychological review, 1975, 82(6): 407-428.
[54] 李晓丽. 中国英语学习者心理词库中的二语语义网络探究[J]. 牡丹江大学学报, 2017, 26(2): 112-117.
[55] 陈仕品, 张剑平. 适应性学习支持系统的学生模型研究[J]. 中国电化教育, 2010, (5): 112-117.
[56] Chen C M, C J Chung. Personalized mobile English vocabulary learning system based on item response theory and learning memory cycle [J]. Computers & Education, 2008, 51(2):624-645.
[57] Jaeyoung J, S Graf. An approach for personalized web-based vocabulary learning through word association games [C]. International Symposium on Applications and the Internet, 2008: 325-328.
[58] 孙明庆. 基于模糊逻辑的自适应学习系统的研究与实现——以高中英语词汇为例[D]. 湖北大学. 2017.
[59] 赵艳平. 高中英语词汇自适应学习系统的设计与开发[D]. 山东师范大学. 2015.
[60] Coffey B. State of the art article -- ESP: English for specific purposes [J]. Language teaching, 1984, 17(1): 2-16.
[61] Robinson P. ESP today: A practitioner’s guide [M]. New York & London: Prentice Hall International (UK) Ltd. 1991.
[62] O'Malley J, A Chamot. Learning strategies in second language acquisition [M]. Cambridge: Cambridge University Press. 1990.
[63] 章国英, 方卫, 李平. 医学英语听力课程主题网站的建设与实践[J]. 中国医学教育技术, 2006, 20(3): 197-199.
[64] 李红, 田秋香. 第二语言词汇附带习得研究[J]. 外语教学, 2005, 26(3): 52-56.
[65] 赵秀红, 聂建中. 合理删词完形填空与阅读能力的关系研究[J]. 教育理论与实践, 2010, 30(4): 56-58.
[66] Lee J J, Hammer J. Gamification in education: What, how, why bother? Academic exchange quarterly, 2011, 15(2): 1-5.
公开日期:

 2018-11-30    

基于多模态理论和图式理论的雅思听说学习系统的研究与设计.周璇

链接

题名:

 基于多模态理论和图式理论的雅思听说学习系统的研究与设计    

姓名:

 周璇    

学号:

 1501210821    

论文语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 张宏岩    

导师1单位:

 软件与微电子学院    

导师2姓名:

 高志军    

导师2单位:

  软件与微电子学院    

论文答辩日期:

 2018-11-30    

外文题名:

 Research and Design of IELTS Listening and IELTS Speaking Preparation Application Based on Schema Theory and Multimodality Theory    

关键词:

 雅思听力 雅思口语 图式理论 多模态理论    

外文关键词:

 IELTS Listening IELTS Speaking Schema Theory Multimodality Theory    

论文摘要:

“雅思考试”是为准备到以英语为交流语言的国家学习、就业或定居的人们设置的一项英语语言水平测试,包含听、说、读、写四个部分。本文集中研究学术考试中的听说部分。随着越来越多的学生选择赴国外求学,雅思考试的热度不断攀升。市面上,各种各样的雅思备考软件也应运而生,以期帮助学生备考雅思。然而,大多数软件只是作为一个题目资源库而存在,仅注重题目练习,忽视了对学生英语听力理解能力和英语口语表达能力的提升。

在备考雅思听说考试过程中,学生也遇到了诸多困难,往往练习了很多套真题,但是成绩依旧未能提高。究其原因,是因为学生只是一味地盲目刷题对答案,遇到的种种困难未能得到解决。从题目练习整个过程来看,学生遇到的困难主要有以下几点:一、做题前,学生未能获得足够多的可理解性输入。二、做题中,未能准确掌握雅思题目的答题技巧。三、做题后,学生未能获得足够的反馈信息;未能及时对错误进行总结分析,并针对性地安排练习;未能接受针对雅思听说考试的进一步技能提升训练。

本系统针对雅思听力和雅思口语的考试特点,从学生遇到的困难出发,评析当前相关教学系统,基于多模态理论和图式理论,结合相关教学实践和移动学习的特点,研究与设计了雅思听说学习系统。雅思听力学习系统中,做题前安排听力词汇和同义替换的学习与测试,做题后多模态方式展示听力原文,提供听写练习和听力原文学习。雅思口语学习系统中,做题前安排口语词汇的学习与测试和复杂句型学习,做题中,安排答题技巧学习,做题后依照答题框架多模态展示范文,提供雅思范文学习。基于以上设计,本文选取本系统中设计的学习方案与以往学习系统中的学习方案进行对比,通过实验、调查问卷和数据分析等方式在认知负荷和学习目标达成情况方面对本系统提出的学习方案进行了验证,证明了本系统设计的学习方案在认知负荷相似的情况下,更有利于学生达成学习目标。

本文设计的系统,做题前,帮助学生获得足够的可理解性输入;做题中,建立和强化答题技巧图式;做题后,解决存在的错误和问题,帮助学生获得足够的反馈和技能提升训练,有助于增强学生对于知识的内化程度,帮助学生形成一个良性的做题循环,发挥每一套真题的价值,达到在题目练习过程中逐步提升成绩的同时,真正提升英语听力理解能力和口语表达能力。

外文摘要:

The International English Language Testing System (IELTS) is the world's most popular English language proficiency test for higher education and global migration. It assesses four skills: listening, speaking, reading, and writing. This paper focuses on the listening and speaking parts of IELTS Academic. As more and more students choose to study abroad, the IELTS test is becoming increasingly popular, and a variety of IELTS preparation systems have been developed to help students prepare for it. However, most existing systems serve only as question banks and focus on practice exercises, neglecting the development of students' listening comprehension and speaking ability.

During exam preparation, students encounter many difficulties. They often complete many sets of past papers, yet their scores do not improve. The reason is that they simply do exercises and check answers without resolving the difficulties they meet. Across the exercise process, the main difficulties are as follows. First, before doing exercises, students do not obtain enough comprehensible input. Second, while doing exercises, students have not accurately grasped IELTS answering techniques. Third, after doing exercises, students do not receive enough feedback, do not summarize and analyze their errors in time or arrange targeted practice, and do not receive further skills training for the IELTS listening and speaking tests.

Starting from the difficulties students face, the characteristics of the IELTS listening and speaking tests, and an analysis of existing teaching systems, this paper designs an IELTS listening and speaking learning system on the basis of Multimodality Theory and Schema Theory, combined with relevant teaching practice and the characteristics of mobile learning. In the listening component, listening vocabulary and synonym substitution are studied and tested before exercises. After exercises, the transcript is displayed in a multimodal way, and dictation practice and transcript study are provided. In the speaking component, spoken vocabulary is studied and tested and complex sentence patterns are studied before exercises. During exercises, answering techniques are taught. After exercises, model answers are displayed multimodally according to the answer framework, and study of these model answers is provided. Based on this design, the paper compares the learning scheme of this system with those of existing systems. Through experiments, questionnaires, and data analysis, the proposed learning scheme is evaluated in terms of cognitive load and achievement of learning goals. The results show that, at a similar cognitive load, the learning scheme of this system is more conducive to students achieving their learning goals.

This system helps students obtain enough comprehensible input before doing exercises, build and strengthen schemata of answering techniques while doing exercises, and, after exercises, resolve errors and problems and obtain sufficient feedback and further skills training. As a result, the system can enhance students' internalization of knowledge, help them form a virtuous cycle of practice, and make full use of every set of past papers, so that their IELTS scores improve gradually while their English listening comprehension and speaking ability genuinely improve.

分类号:

 G43    

论文总页数:

 111    

参考文献总数:

 85    

参考文献列表:
白丽. 2015. 心理信息加工模式下雅思听力教学内容的研究[硕士学位论文]. 哈尔滨师范大学.
曹怡鲁. 1999. 外语教学应借鉴中国传统语言教学经验[J]. 外语界, 2: 17.
曹治. 2017. 多模态视角下大学英语口语教学模式的实证研究[硕士学位论文]. 西安外国语大学.
崔旻, 周春芳. 2015. 多媒体呈现方式在外语词汇直接学习中的效果研究. 解放军外国语学院学报, 38(03): 88-95.
董卫, 付黎旭. 2003. 背诵式语言输入在大学英语教学中的作用. 外语界, 04: 56-59.
范琳, 王庆华. 2002. 英语词汇学习中的分类组织策略实验研究[J]. 外语教学与研究, 03: 209-212.
范琳, 王震. 2014. 词汇重复模式理论与基于语篇语境线索的词汇推理策略. 山东外语教学, 35(05): 54-60.
郭纯洁. 2007. 有声思维法. 北京: 外语教学与研究出版社.
顾曰国. 2007. 多媒体、多模态学习剖析. 外语电化教学, 02: 3-12.
黄荣怀, Jyri Salomaa. 2008. 移动学习——理论·现状·趋势. 北京: 科学出版社.
侯云红. 2013. 大学英语课堂复合式听写练习对听力水平的作用[硕士学位论文]. 延边大学 .
何蓉. 2011. 关于雅思口语考试第三部分若干解决方案的探讨. 西南民族大学学报(人文社会科学版), 32(S2): 174-176.
胡永近, 张德禄. 2013. 英语专业听力教学中多模态功能的实验研究. 外语界, 05: 20-25+44.
胡壮麟. 2007. 社会符号学研究中的多模态化. 语言教学与研究, 01: 1-10.
贾冠杰. 2006. 二语习得论. 南京: 东南大学出版社.
李传益. 2014. 复述式语言输入对英语听说能力有效性实证研究[J]. 当代外语研究, 07: 44-49.
龙宇飞, 赵璞. 2009. 大学英语听力教学中元认知策略与多模态交互研究[J]. 外语电化教学, 04: 58-62+74.
骆雁雁. 2009. 基于语块理论的大学英语词汇教学模式研究[J]. 外语学刊, 06: 168-170.
毛佳玳, 蔡慧萍. 2016. 基于语类的大学英语口语教学模式应用研究[J]. 外语界, 03: 89-96.
戚焱, 蒋玉梅, 朱雪媛. 2015. 大学英语口语教学中词块教学法的有效性研究. 现代外语, 38(06): 802-812+873-874.
孙燕. 2013. 雅思听力考试应试策略. 海外英语, 04: 85-86.
束定芳, 庄智象. 1996. 现代外语教学一理论、实践与方法. 上海: 上海外语教育出版社.
文秋芳. 2008. 输出驱动假设与英语专业技能课程改革. 外语界, 02: 2-9.
文秋芳. 2013. 输出驱动假设在大学英语教学中的应用: 思考与建议. 外语界, 06: 14-22.
文秋芳. 1995. 英语学习策略论. 上海: 上海外语教育出版社 .
王家义. 2012. 基于语料库的英语词汇教学: 理据与应用. 外语学刊, 04: 127-130.
王丽. 2007. 三种大规模标准化英语考试听力测试部分之比较:——一项基于语篇、任务、说话人相关因素的研究. 外语电化教学, 02: 67-72.
汪梅. 2016. 图式理论在高中英语词汇教学中的应用研究[硕士学位论文]. 上海师范大学.
王巍. 2010. 图式理论在高中英语词汇教学的应用研究[硕士学位论文]. 东北师范大学 .
武晶晶. 2013. 朗读在高职非英语专业英语听力教学中的应用[硕士学位论文]. 湖北大学.
吴延国. 2011. 《二语研究中的有声思维法争议》评述. 外语界, 4:93-96.
徐冉. 2017. 最佳教学实践指导下的英语词汇学习系统前端设计与实现[硕士学位论文]. 北京大学.
杨超. 2017. 最佳教学实践指导下的英语听力学习系统的前端设计与实现[硕士学位论文]. 北京大学.
杨映春.2013. 基于图式理论的专业英语听力教学模式实验研究. 广东外语外贸大学学报, 24(05): 96-100.
叶家春, 曾杰. 2016. 英语词汇教学的多模态—认知策略模式. 教育评论, 08: 127-130.
张德禄. 2009. 多模态话语分析综合理论框架探索. 中国外语, 6(01): 24-30.
张彤彤. 2016. 中外合作办学项目的雅思口语教学研究——基于图式理论的教学法初探. 海外英语, 06: 48-50.
张燕燕. 2015. 基于图式理论的英语口语教学模式探析. 求索, 11: 189-192.
张烨, 邢敏, 周大军. 2003. 非英语专业本科生英语词汇学习策略的调查. 解放军外国语学院学报, 04: 44-48.
朱湘华. 2010. 大学英语听力策略训练模式与效果分析. 外语研究, 02: 53-58.
朱永生. 2007. 多模态话语分析的理论基础与研究方法. 外语学刊, 05.
周相利. 2002. 图式理论在英语听力教学中的应用. 外语与外语教学, 10: 24-26
Brown, H. D. 1994. Teaching by Principles: An Interactive Approach to Language Teaching[M]. Englewood Cliff, NJ: Prentice Hall.
Bhatia, V. K. 2014. Analysing genre: Language use in professional settings. Routledge.
Carrell, P. L., & Eisterhold, J. C. 1983. Schema theory and ESL reading pedagogy[J]. TESOL quarterly, 17(4), 553-573.
Chamot, A. U. 1988. A study of learning strategies in foreign language instruction: Findings of the longitudinal study.
Cohen, A.D. 1998. Strategies in Learning and Using a Second Language [M]. London: Longman.
Cook, G. 1989. Discourse[M]. Oxford : Oxford University Press.
Duncker, K., & Lees, L. S. 1945. On problem-solving. Psychological monographs, 58(5), i.
Eggins, S. 1994. An introduction to systemic functional linguistics[M]. London: Pinter.
Ericsson, K. A.& Simon, H. A. 1984. Protocol Analysis: Verbal Reports as Data. Cambridge: The MIT Press.
Faerch, C., & Kasper, G. 1987. Introspection in second language research (Vol. 30). Multilingual Matters Limited.
Flowerdew, J. 1993. An educational, or process, approach to the teaching of professional genres. ELT journal, 47(4), 305-316.
Forceville, C. 2009. Non-verbal and multimodal metaphor in a cognitivist framework: Agendas for research[A]. In Forceville, C. & E. Urios-Aparisi (eds.). Multimodal Metaphor-Application of Cognitive Linguistics[C]. New York: Mouton de Gruyter.
Gerjets, P., Scheiter, K., & Catrambone, R. 2004. Designing instructional examples to reduce intrinsic cognitive load: Molar versus modular presentation of solution procedures. Instructional Science, 32(1-2), 33-58.
Gough, P. B., Juel, C., & Griffith, P. L. 1992. Reading, spelling, and the orthographic cipher. Reading acquisition, 35-48.
Halliday, M.A.K. 1985. An Introduction to Functional Grammar[M]. London: Edward Arnold.
Harmer, J. 1983. The practice of English language teaching. Longman, 1560 Broadway, New York, NY 10036.
Hasan, R. 1978. Text in the systemic-functional model. Current trends in textlinguistics, 2, 229-45.
Johnson, D. W., & Johnson, R. T. 1989. Cooperative learning: What special education teachers need to know. The Pointer, 33(2), 5-11.
Kalyuga, S., Chandler, P., & Sweller, J. 1999. Managing split‐attention and redundancy in multimedia instruction. Applied Cognitive Psychology: The Official Journal of the Society for Applied Research in Memory and Cognition, 13(4), 351-371.
Kester, L., Lehnen, C., Van Gerven, P. W., & Kirschner, P. A. 2006. Just-in-time, schematic supportive information presentation during cognitive skill acquisition. Computers in Human Behavior, 22(1), 93-112.
Krashen, S. D. 1985. The Input Hypothesis: Issues and Implications[M]. London: Longman.
Kress, G. 2001. Sociolinguistics and social semiotics[A]. In Cobley, P.(ed.) The Routledge Companion to Semiotics and Linguistics[C]. London and New York: Routledge.
Larsen-Freeman D. 2005. Teaching Language: From Grammar to Grammaring[M]. Beijing: Foreign Language Teaching and Research Press.
Lee, H., Plass, J. L., & Homer, B. D. 2006. Optimizing cognitive load for learning from computer-based science simulations. Journal of educational psychology, 98(4), 902.
O'Malley, M. J., & Chamot, A. U. 1990. Learning strategies in second language acquisition. Cambridge university press.
Oxford, R. 1990. Language learning strategies. New York, 3.
Paas, F. G., Van Merriënboer, J. J., & Adam, J. J. 1994. Measurement of cognitive load in instructional research. Perceptual and motor skills, 79(1), 419-430.
Pennycook, A. 1996. Borrowing others' words: Text, ownership, memory, and plagiarism. TESOL quarterly, 30(2), 201-230.
Pollock, E., Chandler, P., & Sweller, J. 2002. Assimilating complex information. Learning and instruction, 12(1), 61-86.
Richards J. 2006. Second Language Listening: Theory and Practice[M]. Cambridge: Cambridge University Press.
Royce, T. 2002. Multimodality in the TESOL classroom: Exploring visual‐verbal synergy. TESOL quarterly, 36(2), 191-205.
Rumelhart, D.E. 1980. Schemata: the building blocks of cognition. In: R.J. Spiro et al. (eds) Theoretical Issues in Reading Comprehension[C], Hillsdale, NJ: Lawrence Erlbaum.
Schmidt, R. 1990. The Role of Consciousness in Second Language Learning[J]. Applied Linguistics, 11( 2): 129 -158.
Skehan, P. 1998. Individual Differences in Second Language Learning [M]. London: Edward Arnold.
Stein, P. 2000. Rethinking Resources: Multimodal Pedagogies in the ESL Classroom[J].TESOL Quarterly, (34):333-336.
Swain, M. 1985. Communicative competence: some roles of comprehensible input and comprehensible output in its development [A]. In S. M. Gass & C. G. Madden (eds.). Input in Second Language Acquisition [C]. Rowley, MA: Newbury House.
Swain, M. 1993. The output hypothesis: Just speaking and writing aren't enough [J]. The Canadian Modern Language Review 50:158-164.
Swain, M. 1995. Three functions of output in second language learning [A]. In G. Cook. & B. Seidlhofer (eds.). Principle and Practice in Applied Linguistics [C]. Oxford: Oxford University Press.
Sweller, J. 1988. Cognitive load during problem solving: Effects on learning. Cognitive science, 12(2), 257-285.
Tarmizi, R. A., & Sweller, J. 1988. Guidance during mathematical problem solving. Journal of educational psychology, 80(4), 424.
The New London Group. 1996. A pedagogy of multiliteracies: Designing social futures. Harvard educational review, 66(1), 60-93.
Underwood, M. 1990, Teaching Listening[M]. New York: Longman.
Van Merriënboer, J. J., Kirschner, P. A., & Kester, L. 2003. Taking the load off a learner's mind: Instructional design for complex learning. Educational psychologist, 38(1), 5-13.
公开日期:

 2018-11-30    

基于模拟方法的技术写作同源开发教学研究.杨爱萍

链接

题名:

 基于模拟方法的技术写作同源开发教学研究    

姓名:

 杨爱萍    

学号:

 1501210755    

论文语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 李博婷    

导师1单位:

 软件与微电子学院    

导师2姓名:

 高志军    

导师2单位:

 软件与微电子学院    

论文答辩日期:

 2018-11-30    

外文题名:

 Research on Single Sourcing Teaching Based on Simulation-based Method    

关键词:

 技术写作 同源开发教学 模拟方法 教学设计    

外文关键词:

 Technical writing Single sourcing teaching Simulation-based teaching method Instructional design    

论文摘要:

技术文档开发需求日益增长,同源开发方法应运而生。作为一种文档开发方法论,同源开发强调了通过内容模块化实现文档系统性复用的重要思想。同源开发是北京大学技术传播教学体系中的重要组成部分,教学目标以掌握其基本原理、思路及流程为核心,帮助学生完成线性文档的拆解与模块化文档的生成。

总结近几年的教学经验,学生在掌握同源开发上仍存在三项突出问题:1) 主题识别困难,识别结果准确度低;2) 写作过程中的技术原理掌握不到位;3) 主题组织条理不清,文档架构混乱。这些问题与同源开发学习内容、教学方式及教学工具有着密不可分的关系。

本文通过调查与访谈进一步探究教学问题的症结,总结学习者及行业需求,调研现有教学工具,创新性地设计了CCMS教学模拟器SuperEasyDITA,填补同源开发教学工具的空白,并基于该工具,对症下药展开教学方法设计,具体采用了:1) 主题逆向拆解匹配、正向分析识别的双向教学模式,解决主题识别困难;2) 写作方式难度渐进、写作过程技术提醒、写作成品切换修改,提升技术原理掌握程度;3) 学生自主构建情境、同伴协作讨论架构,优化文档组织架构思路与方法;4) 引导性反馈与讨论,加深对同源开发原理、思路与流程的整体理解,深化各过程学习要点。

为验证教学方法的有效性,本研究依托北京大学2017、2018级技术传播专业课程选修学生开展了教学实验,其中,实验组采用基于模拟器的教学方法,对照组采用传统教学方法。研究结果表明,教学模拟器SuperEasyDITA满足了文档同源开发基本功能需求,在教学有用性、易用性及创新性上都较传统教学工具有显著优势;基于模拟器的教学方法有助于学生掌握XML相关技术原理,提升写作过程中技术原理的掌握程度;有助于学生识别主题类型,提升主题识别准确度;有助于学生组织主题,改善文档组织架构;在整体教学效果上提升教学效率的同时解决了教学问题,能真正有效帮助学生更好地掌握同源开发知识体系。

外文摘要:

The demand for technical documents is growing, and a document development methodology called single sourcing has emerged in response. Single sourcing emphasizes the systematic reuse of documents through content modularization. Single sourcing teaching is an important part of Peking University's technical communication curriculum. The teaching objectives focus on its basic principles, ideas, and processes, and on helping students disassemble linear documents and generate modular documents.

After summarizing and analysing the teaching experience of recent years, this paper finds that students still have three prominent problems in learning single sourcing: 1) difficulty in identifying topic types, with low recognition accuracy; 2) difficulty in understanding XML-related technical knowledge; 3) difficulty in organizing topics, which leaves the document structure unclear. These problems are closely linked to the single sourcing learning content, teaching methods, and teaching tools.

This paper explores the crux of the above problems through surveys and interviews, summarizes learner and industry needs, analyzes existing teaching tools, and innovatively designs the CCMS teaching simulator SuperEasyDITA to fill the gap in single sourcing teaching tools. Based on SuperEasyDITA, this paper designs a teaching method aimed at solving the current problems. Specifically, it adopts: 1) topic disassembly and matching from a standard modular document, and topic analysis and recognition from a linear document, which addresses the problem of topic recognition; 2) writing modes of gradually increasing difficulty, instant technical knowledge reminders during the writing process, and observation, switching, and modification of the final document, which improves the mastery of technical knowledge; 3) student construction of document use situations and collaborative discussion, which strengthens document organizing ideas and methods; 4) instructive feedback and discussion, which deepens the understanding of the overall ideas and processes.

To verify the effectiveness of this method, this study carried out teaching experiments with students in the 2017 and 2018 technical communication courses of Peking University. The experimental group used the simulator-based teaching method and the control group used the traditional teaching method. The results show that the teaching simulator SuperEasyDITA satisfies the basic functional requirements of single sourcing and has significant advantages over traditional teaching tools in teaching usefulness, ease of use, and innovation. The simulator-based teaching method helps students improve topic recognition accuracy, understand XML and related technical knowledge in the writing process, and organize topics and improve document organization. It improves teaching efficiency while optimizing the teaching process and solving the teaching problems, and can effectively help students better master the knowledge of single sourcing.

分类号:

 H08    

论文总页数:

 95    

参考文献总数:

 69    

参考文献列表:
安德森, 皮连生. 学习、教学和评估的分类学[M]. 华东师范大学出版社, 2008.
褚慧玲. 基于学校教学常规考试的试卷命制技术[J]. 考试研究, 2008(4):81-92.
费丽嫚. 情景模拟器的设计与实现[硕士学位论文]. 上海:华东师范大学, 2015.
何克抗, 林君芬, 张文兰. 教学系统设计[M]. 高等教育出版社, 2016.
何克抗. 建构主义的教学模式、教学方法与教学设计[J]. 北京师范大学学报(社会科学版), 1997(5).
胡迎春, 广西壮族自治区教育厅组织编写. 职业教育教学法[M]. 华东师范大学出版社, 2010.
金瑞华, 刘春凤, 罗丹. 高仿真模拟教学中引导性反馈的应用进展[J]. 中国高等医学教育, 2017(5):95-97.
李向东, 卢双盈. 职业教育学新编[M]. 高等教育出版社, 2005.
刘晓瑜. 标准参照考试的若干理论与质量分析方法[J]. 华南师范大学学报(社会科学版), 1996(6):69-74.
李双燕. 2015年中国技术写作发展现状调查报告[C]// 中国科协年会. 2015.
李玮. 情景模拟教学法对管理学教学的启示[J]. 教育探索, 2008(7):63-64.
向梅梅, 刘明贵. 应用型本科高校实践教学研究[M]. 暨南大学出版社, 2011.
余文森. 有效教学的理论和模式[M]. 福建教育出版社, 2011.
张军征. 多媒体教学软件设计原理与方法[M]. 科学出版社, 2007.
张建伟. 基于模拟式教学及其效果研究回顾[J]. 电化教育研究, 2001(7):68-71.
张伟远. 网上学习环境评价模型、指标体系及测评量表的设计与开发[J]. 中国电化教育, 2004(7):29-33.
佐藤正夫. 教学论原理[M]. 人民教育出版社, 1996.
Abel, S. In search of professional-grade content marketing. [EB/OL] (2013-07-29) [2018-04-09].http://www.thecontentwrangler.com/2013/07/29/in-search-of-professional-grade-content-marketing/.
Albers M. Single Sourcing and the Technical Communication Career Path [J]. Technical Communication, 2003, 50(3):335-343.
Ament, K. Single sourcing: Building Modular Documentation [M]. William Andrew, 2002.
Andersen R, Batova T. The Current State of Component Content Management: An Integrative Literature Review [J]. IEEE Transactions on Professional Communication, 2016, 58(3):247-270.
Batova T, Andersen R. A Systematic Literature Review of Changes in Roles/Skills in Component Content Management Environments and Implications for Education [J]. Technical Communication Quarterly, 2017, 26(2).
Batova T, Andersen R, Evia C, et al. Incorporating Component Content Management and Content Strategy into Technical Communication Curricula[C]// Acm International Conference on the Design of Communication. ACM, 2016.
Bellamy L. DITA Best Practices [J]. Addison-Wesley Longman, Amsterdam, 2011.
Benson R, Brack C. Developing the scholarship of teaching: what is the role of e-teaching and learning? [J]. Teaching in Higher Education, 2009, 14(1):71-80.
Bell B S, Kanar A M, Kozlowski S W J. Current issues and future directions in simulation-based training in North America [J]. The International Journal of Human Resource Management, 2008, 19(8):1416-1434.
Carlsen, DD. Use of a Microcomputer Simulation and Conceptual Change Text to Overcome Student Preconceptions about Electric Circuits [J]. Journal of Computer-Based Instruction, 1992, 19(4):105-109.
Carter L. The Implications of Single Sourcing for Writers and Writing [J]. Technical Communication, 2003, 50(3):317-320.
Carrington, N. Teaching students to learn unfamiliar technology [J]. Programmatic Perspectives, 2015, 2(7), 230-250.
Chambers, S. K., Haselhuhn, C., Andre, T., Mayberry, C., Wellington, S., Krafka, A., & Berger, J. The acquisition of a scientific understanding of electricity: Hands-on versus computer simulation experience; conceptual change versus didactic text [J]. In Annual Meeting of the American Educational Research Association, New Orleans, LA, 1994.
Chronister C, Brown D. Comparison of Simulation Debriefing Methods [J]. Clinical Simulation in Nursing, 2012, 8(7):e281-e288.
Costabile M F, Marsico M D, Lanzilotti R, et al. On the Usability Evaluation of E-Learning Applications[C]// Hawaii International Conference on System Sciences. IEEE Computer Society, 2005.
Cooper A., Reimann, R., & Dubberly, H. About Face 2.0: The Essentials of Interaction Design [C]// John Wiley & Sons, Inc. 2007.
Decker S, Fey M, Sideras S, et al. Standards of Best Practice: Simulation Standard VI: The Debriefing Process [J]. Clinical Simulation in Nursing, 2013, 9(6):S26-S29.
Dekkers J, Donatti S, Dekkers J, et al. The Integration of Research Studies on the Use of Simulation as an Instructional Strategy [J]. Journal of Educational Research, 1981, 74(6):424-427.
Dicheva D, Dichev C. Gamification in Education: Where Are We in 2015? [C]//E-Learn: World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education. Association for the Advancement of Computing in Education (AACE), 2015: 1445-1454.
Dreifuerst, K. T. The essentials of debriefing in simulation learning: a concept analysis [J]. Nursing Education Perspectives, 2009, 30(2):109-114.
Doherty S. Leveraging industry onboarding materials in the curriculum[C]// Acm International Conference on the Design of Communication. ACM, 2017.
Dzida W, Freitag R. Making use of scenarios for validating analysis and design [J]. IEEE Transactions on Software Engineering, 2002, 24(12):1182-1196.
Eble M F. Content vs. Product: The Effects of Single Sourcing on the Teaching of Technical Communication [J]. Technical Communication, 2003, 50(3):344-349.
Evans, R. Teaching Single Sourcing To Bridge the Gap between Classrooms and Industry. [EB/OL] (2013-09-05) [2018-04-09]. https://www.writingassist.com/newsroom/teaching-single-sourcing/
Grimes P W, Willey T E. The effectiveness of microcomputer simulations in the principles of economics course [J]. Computers & Education, 1990, 14(1):81-86.
Groom J A, Henderson D, Sittner B J. National League for Nursing Jeffries Simulation Framework State of the Science Project: Simulation Design Characteristics [J]. Clinical Simulation in Nursing, 2014, 10 (7), 337–344.
Hanson AJ, Lindahl P, Strasser SD, Takemura AF, Englund DR. Technical Communication Instruction for Graduate Students: The Communication Lab vs. a Course [J]. 2017 ASEE Annual Conference & Exposition, 27.
Hart-Davidson W. On Writing, Technical Communication, and Information Technology: The Core Competencies of Technical Communication [J]. Technical Communication, 2001, 48(2):145-155.
Henschel, S. M. Authoring content for reuse: A study of methods and strategies, past and present, and current implementation in the technical communication curriculum [D]. 2010, Lubbock, TX: Texas Tech University.
Hovde M R, Renguette C C. Technological Literacy: A Framework for Teaching Technical Communication Software Tools [J]. Technical Communication Quarterly, 2017, 26(2).
Kolb D A, Boyatzis R E, Mainemelis C. Experiential Learning Theory: Previous Research and New Directions [J]. 2001.
Kulik J A. Effects of Computer-Based Teaching on Secondary School Students [J]. Journal of Educational Psychology, 1983, 75(1):19-26.
Lave J, Wenger E. Situated learning: legitimate peripheral participation [J]. 状況に埋め込まれた学習:正統的周辺参加, 1991, 29(2):167-182.
Lee J. Effectiveness of computer-based instructional simulation: A meta-analysis [J]. International Journal of Instructional Media, 1999, 26(1):71-85.
Mariani B, Cantrell M A, Meakim C. Nurse educators' perceptions about structured debriefing in clinical simulation [J]. Nursing Education Perspectives, 2014, 35(5):330-331.
Mcshane, B. J. How to teach xml: a brief tutorial [J]. Intercom, 2007, 54, 20-39.
McDaniel, R., & Steward, S. Technical communication pedagogy and the broadband divide: Academic and industrial perspectives [J]. Complex worlds: Digital culture, rhetoric, and professional communication, 2011, 195-212.
Papert, S. Situating Constructionism [A]. In I. Harel, & S. Papert (Eds.). Constructionism: Research Reports and Essays 1985-1990 [C]. Norwood, N.J.: Ablex Publishing Corporation. 1991, 1-11.
Price, R. M., Denise S P, Joel K A, et al. Observing populations and testing predictions about genetic drift in a computer simulation improves college students’ conceptual understanding [J]. Evolution Education & Outreach, 2016, 9(1):8.
Pruitt, John, Adlin, et al. The Persona Lifecycle [M]. 2006.
Rentroia-Bonito M A, Jorge J A P. An Integrated Courseware Usability Evaluation Method[C]// International Conference on Knowledge-based Intelligent Information. 2003, 2774, 208-214.
Robidoux, Charlotte. Rhetorically Structured Content: Developing a Collaborative Single-Sourcing Curriculum [J]. Technical Communication Quarterly, 2007, 17(1):110-135.
Robidoux, C., & Waychoff, P. CMS solutions: Knowing the right stuff [J]. Best Practices, Center for Information-Management Development, 2005a, 7, 86–89.
Rockley A, Cooper C. Managing Enterprise Content [M]. New Riders, 2012.
Rockley A. The Impact of Single Sourcing and Technology [J]. Technical Communication, 2001, 48(2):189-193.
Rush Hovde, M., Renguette C C. Technological Literacy: A Framework for Teaching Technical Communication Software Tools [J]. Technical Communication Quarterly, 2017, 26(2), 395-411.
Salas E, Wildman J, Piccolo R. Using simulation-based training to enhance management education [J]. Academy of Management Learning & Education, 2009, 8(4):559-573.
Sapienza, F. Does being technical matter? xml, single source, and technical communication [J]. Journal of Technical Writing & Communication, 2002, 32(2), 155-170.
Schertler M. E-Teaching Scenarios [J]. Virtual Technologies Concepts Methodologies Tools & Applications, 2008.
Self T. The DITA Style Guide: Best Practices for Authors [M]. Scriptorium Publishing Services, Incorporated, 2011.
Thomas R, Hooper E. Simulations: An opportunity we are missing [J]. Journal of Research on Computing in Education, 1991, 23:497-513.
Young MF. Instructional design for situated learning [J]. Educational Technology Research and Development, 1993, 41(1):43-58.
公开日期:

 2018-11-30    

2018-06-06

指称理论对于生成语法的必要性.张振宝

链接

题名:

 指称理论对于生成语法的必要性    

姓名:

 张振宝    

学号:

 1401213083    

论文语种:

 eng    

专业:

 文学 - 外国语言文学 - 外国语言学及应用语言学    

公开时间:

 1年后    

培养层次:

 硕士    

学位:

 文学硕士    

培养单位:

 北京大学    

院系:

 外国语学院    

导师1姓名:

 何卫    

导师1单位:

 外国语学院    

论文答辩日期:

 2018-06-06    

外文题名:

 On the Position of Reference Theory in Generative Grammar    

关键词:

 生成语法 指称 满足概念 必要性 完全解释原则    

外文关键词:

 Generative Grammar reference notion of satisfaction necessity FI principle    

论文摘要:

索绪尔认为语言是一个结构系统,能指和所指是语言符号互补的两个方面。弗雷格在对意义和指称进行区分的基础上,认为语言符号表达意义,指称个体的人或事物。乔姆斯基在处理语言的语义问题时,曾多次否认指称是人类语言系统的组成部分。但通过对生成语言学发展历程的梳理,本文发现,指称对于该语言学理论具有十分重要的作用。本文首先对指称理论进行讨论,梳理了弗雷格、罗素和斯特劳森三位代表性哲学家关于指称的观点,并结合塔斯基的满足概念,尝试将指称理论与生成语法的句法运算联系起来,继而以此理论关联为切入点,探讨指称对于生成语法的必要性。在早期以范畴为基础的规则系统,即短语结构语法中,句子被改写成由句法范畴构成的结构系统,然后在每个范畴内选取具体的词语,构成实际使用的语言。这样,这种语言生成方法就不会涉及到指称的问题。而在后来的原则-参数理论中,如果不考虑指称,DP和IP就无法满足扩展的投射原则,从而会导致句法运算的失败。而在最简方案中,完全解释原则要求语言单位在句法运算的每一步都能得到完全解释,即将每一语言单位在每一步运算所产生的结果都解释为意义和语音的结合体。而在最简方案中,如果不考虑指称,DP以及IP,包括时态、情态动词等,都无法得到完全解释,这样句法运算就会崩溃(crash)。基于此,本文得出结论,指称对于生成语言学是十分必要的。

外文摘要:

Saussure considers language a structured system, with signified and signifier as its two complementary facets. Frege’s theory of reference, based on the distinction between sense and referent, claims that a language sign expresses its sense and denotes its referent. Chomsky, in his treatment of semantic problems, repeatedly rejects reference as part of the human language system. However, a brief survey of the historical development of Generative Grammar reveals that the reference relation cannot be neglected and instead plays a very important role in language computation. This thesis investigates the necessity of reference in Generative Grammar. By surveying the reference theories of Frege, Russell and Strawson, this thesis finds that DP can be defined by more primitive elements, namely undetermined variables, and that it is by assigning a truth value to the variable that a DP denotes a person or an object in the world. Tarski’s notion of satisfaction defines truth through syntax, which, when connected with the definition of DP, can be used to test whether the reference relation is necessary for Generative Grammar. In the early Category-based Rule System, reference is not involved in language computation. According to this system, a sentence is rewritten as a syntactic structure composed of syntactic categories. Then a word is picked from each category to produce a terminal sentence. In the Principle and Parameter Model, syntactic levels like DP and IP cannot satisfy their respective sentential functions without considering reference, which violates the Extended Projection Principle; the language computation therefore cannot proceed, because projection is the basic mode of language computation in this model. Then, in the Minimalist Program of the Principle and Parameter Model, DP and IP cannot receive their full interpretation without considering reference. Therefore, the syntactic computation will crash, for the FI Principle is a general property of natural language.
Based on these arguments, the thesis concludes that reference is necessary for Generative Grammar.

分类号:

 H04    

论文总页数:

 64    

参考文献总数:

 49    

参考文献列表:
References
Aarsleff, H. 1970. The History of Linguistics and Professor Chomsky. Language. Vol. 46, No. 3: 570-585.
Alsina, A. 1992. On the Argument Structure of Causatives. Linguistic Inquiry. Vol. 23, No. 4: 517-555.
Antony, L. M. & N. Hornstein. 2003. Chomsky and His Critics. Hoboken, New Jersey: The Blackwell Publishing.
Araki, N. 2015. Saussure and Chomsky, Language and I-Language. Bull. Hiroshima Inst. Tech. Research. Vol.49: 1-11.
Barman, B. 2012. The Linguistic Philosophy of Noam Chomsky. Philosophy and Progress. Vol LI-LII, January-June: 104-122.
Berwick, R. C. & N. Chomsky. 2017. Why Only Us, Recent Questions and Answers. Journal of Neurolinguistics. Vol 43, Part B: 166-177.
Black, C. A. A Step-by-step Introduction to the Government and Binding Theory of Syntax. http://www.mexico.sil.org/sites/mexico/files/e002-introgb.pdf.
Boskovic, Z. Principles and Parameters and Minimalism. http://web2.uconn.edu/boskovic/papers/PrincParam&Minimalism.DikkenRevised2010Final.pdf.
Carnie, A. 2006. Syntax, a Generative Introduction (2nd Edition). Hoboken, New Jersey: The Blackwell Publishing.
Carrier, J. & H. J. Randall. 1992. The Argument Structure and Syntactic Structure of Resultative. Linguistic Inquiry. Vol. 23, No. 2: 173-234.
Chomsky, N.
—1957. Syntactic Structures. Hague: Mouton Publishers.
—1965. Aspects of the Theory of Syntax. https://faculty.georgetown.edu/irvinem/theory/Chomsky-Aspects-excerpt.pdf.
—1968. Quine’s Empirical Assumptions. Synthese. Vol.19, No 1/2: 53-68.
—1981. Knowledge of Language, Its Elements and Origins. Philosophical Transactions of the Royal Society. Vol. 295, Series B: 223-234.
—1982. A Note on the Creative Aspect of Language. The Philosophical Review. Vol. 91, No. 3: 423-434.
—1984. Noam Chomsky Writes to Mrs. Davis about Grammar and Education. English Education. Vol. 16, No. 3: 165-166.
—1986. Knowledge of Language, Its Nature, Origin, and Use. New York, London: Praeger Special Studies.
—1992. Explaining Language Use. Philosophical Topics: 205-231.
—1994. Models, Nature and Language. Grand Street: 170-176.
—1995. Language and Nature. Mind, New Series: Vol. 104, No. 413: 1-61.
—1995. The Minimalist Program. Cambridge, MA: The MIT Press.
—1997. Language and Problems of Knowledge. Teorema: Revista Internacional de Filosofia. Vol. 16, No. 2: 5-33.
—2000. New Horizons in the Study of Language and Mind. Cambridge: Cambridge University Press.
—2006. Language and Mind (3rd Edition). Cambridge: Cambridge University Press.
—2013. Problems of Projection. Lingua 130: 33-49.
Chomsky, N., A. J. Gallego & D. Ott. Generative Grammar and the Faculty of Language: Insights, Questions and Challenges. https://www.google.com.hk/url.
Chomsky, N. & J. J. Katz. 1971. What the Linguist Is Talking About. The Journal of Philosophy. Vol. 71, No. 12: 347-367.
Emonds, J. E. 1991. Subcategorization and Syntax-Based Theta-role Assignment. Natural Language & Linguistic Theory. Vol. 9, Issue. 3: 369-429.
Frege, G. 1948. Sense and Reference. The Philosophical Review. Vol. 57, No. 3: 209-230.
Freidin, R. 2007. Generative Grammar, Theory and Its History. London and New York: Routledge Taylor & Francis Group.
Haegeman, L. 1997. Elements of Grammar, a Handbook of Generative Syntax. Springer: Springer Science + Business Media Dordrecht.
Hauser, M. D., N. Chomsky & W. T. Fitch. 2002. The Faculty of Language, What Is It, Who Has it, and How Did It Evolve? Science. Vol. 298, Issue 5598: 1569-1579.
Heim, I. & A. Kratzer. 1998. Semantics in Generative Grammar. Oxford: Blackwell Publisher.
Jackendoff, R. Reexamining the Foundations of Generative Grammar. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.360.1856&rep=rep1&type=pdf.
Katz, J. J. 1980. Chomsky on Meaning. Language: 1-41.
Lasnik, H. 2002. The Minimalist Program in Syntax. Trends in Cognitive Sciences. Vol. 6, Issue. 10: 432-437.
Lidz, J. & L. Gleitman. 2014. Yes, We Still Need Universal Grammar. Cognition 94: 85-93.
Lopez, B. G. 2001. Argument Structure, Thematic Roles and Linking. Atlantis. Vol. 23, No.2: 49-64.
Ludlow, P. 2011. The Philosophy of Generative Linguistics. Oxford: The Oxford Press.
Putnam, L. R. & N. Chomsky 1994-1995. An Interview with Noam Chomsky. Reading Teacher: 328-333.
Roberts, I. 2016. The Oxford Handbook of Universal Grammar. Oxford: Oxford University Press.
Runner, J. T. 2002. When Minimalism Isn’t Enough, an Argument for Argument Structure. Linguistic Inquiry. Vol. 33, No. 1: 172-182.
Russell, B. 1905. On Denoting. Mind, New Series. Vol. 14, No. 56: 479-493.
Russell, B. 2010. The Principles of Mathematics. London and New York: Routledge.
Saussure, F. de. 2001. Course in General Linguistics. Beijing: Foreign Language Teaching and Research Press.
Stainton, R. J. Meaning and Reference—Some Chomskian Themes. http://publish.uwo.ca.
Lepore, E. & B. C. Smith. 1976. Handbook of Philosophy of Language. Oxford: Oxford University Press.
Strawson, P. F. 1950. On Referring, Mind. Vol. 59, No. 235: 320-344.
Tarski, A. Concept of Truth in Formalized Languages. http://www.thatmarcusfamily.org.
公开日期:

 2019-06-06    

2018-05-27

英汉翻译中的变通与忠实.张英杰

链接

题名:

 英汉翻译中的变通与忠实    

姓名:

 张英杰    

学号:

 1601213263    

论文语种:

 chi    

专业:

 专业学 - 翻译硕士 - 英语笔译    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 翻译硕士专业学位    

培养单位:

 北京大学    

院系:

 外国语学院    

导师1姓名:

 朱源    

导师1单位:

 中国人民大学外国语学院    

论文答辩日期:

 2018-05-27    

外文题名:

 Flexibility and Fidelity in E-C Translation    

关键词:

 英汉翻译 变通 忠实    

外文关键词:

 E-C translation flexibility fidelity    

论文摘要:

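In translation practice, the translator finds that, beyond understanding the source text, the main work of translation lies in overcoming differences between languages: the same idea is expressed one way in English but must be phrased differently in Chinese to be understood. Differences between languages always exist, so how to change the "phrasing" to convey meaning is a perennial question in translation. Translation, of course, is not always about change. This translation report argues that flexibility and fidelity are the two major principles of translation. First, as a conversion from one language to another, translation is on the whole a process of linguistic domestication. It necessarily involves linguistic flexibility in order to reconcile the two languages' differences in grammar, idiom, and the like, and to achieve the main purpose of translation, which is to convey meaning. Beyond linguistic flexibility, translation naturally has things that should not "change", that is, respects in which it should be faithful to the original. This report defines "fidelity" as fidelity in three respects: meaning, language style, and terminology. To support this argument, the report draws on the linguistic features of the book 《消费时代的迷思》 to discuss, with examples, a range of flexibility strategies, such as translating abstract nouns, handling parentheticals, handling long clauses, and adding logical connectives. It also illustrates how the translator achieves fidelity in terminology, and the specific methods used to faithfully convey the language style of the original.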
 

分类号:

 H059    

论文总页数:

 276    

参考文献总数:

 14    

参考文献列表:
冯世则:意译、直译、逐字译,载《中国翻译》,1981年第二期,7-10页。
辜正坤:翻译标准多元互补论,载《中国翻译》,1989第一期,100-105页。
黄河清,毛荣贵:科技翻译使用括号举隅,载《上海翻译》,1988年第四期,23-24页。
姜望琪:论术语翻译的标准,载《上海翻译》,2005第一期,80-84页。
黎运汉:1949年以来语言风格定义研究评述,载《语言文字应用》,2002第一期,100-106 页。
孙周兴:学术翻译的几个原则——以海德格尔著作之汉译为例证,载《中国翻译》,2013 第四期,70-73页。
王克非:近代翻译对汉语的影响,载《外语教学与研究》,2002年第六期,458-463页。
王力:《中国现代语法》。上海:商务印书馆,1943。
王文华:动静之间,载《中国翻译》,2001年第二期,44-47页。
解献芬:试论中西“语言风格”的定义,载《清华大学学报(哲学社会科学版)》2004第一 期,55-59。
许余龙:《对比语言学》。上海:上海外语教育出版社,2002。
余光中:论“的的不休”,载《余光中谈翻译》, 北京:中国对外翻译出版公司,2002。
Li, C. N. & Thompson, S. A. "Subject and Topic: A new typology of language." Contemporary Linguistics (1984).
Tytler, Alexander Fraser. Essay on the Principles of Translation (1813): New edition. Vol. 13. John Benjamins Publishing, 1978.
馆藏号:

 039/M2018(103)    

公开日期:

 2018-05-27    

2018-05-26

基于深度学习的文本语句扩展系统的设计与实现.于昌和

链接

题名:

 基于深度学习的文本语句扩展系统的设计与实现    

作者:

 于昌和    

学号:

 1501210770    

语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 3年后    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师姓名:

 俞敬松    

导师单位:

 软件与微电子学院    

答辩日期:

 2018-05-26    

关键字(中文):

 深度学习 文本扩展 Seq2Seq    

文摘:

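English is one of the most widely used languages in the world. In China, more and more people are learning English, and English writing has become one of the most important parts of English learning. English sentence expansion comes from the "write-long" method (写长法) used in teaching English writing, in which students learn various techniques and methods to keep lengthening their sentences, exercising their ability to learn and produce English. In real language services, writing is an important step, and sentences need to be not only longer but also richer and more fluent. Starting from sentence lengthening, this thesis explores the use of deep learning techniques to help writers improve their writing techniques and strategies. Seq2seq has achieved good results in machine translation; this thesis uses different encoders and decoders within seq2seq to expand text sentences in three ways:
1. Vocabulary expansion. Some English writers produce sentences that are dry and thin and do not use adjectives and adverbs. To make sentences richer and more fluent, this thesis designs an adjective and adverb expansion module that supplements and expands the adjectives and adverbs in a sentence.
2. Sentence continuation. Some English learners write one sentence and then do not know how to continue. For this case, this thesis designs a sentence continuation module.
3. Sentence generation. The thesis explores generating sentences from nouns and verbs alone, because some English writers can think of only a few verbs and nouns and cannot organize them into a complete, fluent sentence. For this case, this thesis designs a sentence generation module.
The vocabulary expansion module helps writers supply adjectives and adverbs: on the training set it fully recovers 63% of the adjectives and adverbs. On 900 test sentences, the sentences expanded by the model score only 0.14 lower under a language model than the target sentences. On the sentence continuation task, the model can give writers ideas to some extent. On the task of automatically generating sentences from verbs and nouns, some progress was made: the BLEU score reached 27.23 and word coverage reached 63%.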
 

分类号:

 TP3    

论文总页数:

 60    

参考文献数:

 38    

参考文献:
[1] Mccoy K F. Simple NLP Techniques for Expanding Telegraphic Sentences[J]. Sentences Natural Language Processing for Communication Aids,1997, 2007.
[2] Artificial Neural Networks[J]. Encyclopedia of Microfluidics & Nanofluidics:23-33.
[3] Rosenblatt F. The perceptron: a probabilistic model for information storage and organization in the brain[M] Neurocomputing: foundations of research. MIT Press, 1988:386-408.
[4] Jeffrey L. Elman. Finding Structure in Time[J]. Cognitive Science,1990, 14(2):179 -211.
[5] Gregor K, Danihelka I, Graves A, et al. DRAW: a recurrent neural network for image generation[J]. Computer Science, 2015:1462-1471.
[6] Mikolov T, Karafiát M, Burget L, et al. Recurrent neural network based language model[C]// INTERSPEECH 2010, Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September. DBLP, 2010:1045-1048.
[7] Li L, Jin L, Jiang Z, et al. Biomedical named entity recognition based on extended Recurrent Neural Networks[C]// IEEE International Conference on Bioinformatics and Biomedicine. IEEE, 2015:649-652.
[8] Cho K, Van Merrienboer B, Gulcehre C, et al. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation[J]. Computer Science, 2014.
[9] Fan E G. Extended Tanh-function Method and its Applications to Nonlinear Equations[J]. Physics Letters A, 2000, 277(4):212-218.
[10] Hecht-Nielsen R. Theory of the backpropagation neural network[M].Neural networks for perception (Vol. 2). Harcourt Brace & Co. 1992:593-605 vol.1.
[11] Schmidhuber J. Deep learning in neural networks[M]. Elsevier Science Ltd. 2015.
[12] Schuster M, Paliwal K K. Bidirectional recurrent neural networks[J]. IEEE Transactions on Signal Processing, 1997, 45(11):2673-2681.
[13] Hochreiter S. LSTM can solve hard long time lag problems[C]// International Conference on Neural Information Processing Systems. MIT Press, 1996:473-479.
[14] Hochreiter S, Schmidhuber J. Long Short-Term Memory[J]. Neural Computation, 1997, 9(8):1735-1780.
[15] Graves A, Schmidhuber J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures[J]. Neural Netw, 2005, 18(5):602-610.
[16] Gers F A, Schraudolph N N. Learning precise timing with lstm recurrent networks[M]. JMLR.org, 2003.
[17] Gers F A, Schmidhuber J, Cummins F. Learning to forget: continual prediction with LSTM[J]. Neural Computation, 2000, 12(10):2451-2471.
[18] Cho K, Van Merrienboer B, Gulcehre C, et al. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation[J]. Computer Science, 2014.
[19] Fukushima K. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position[J]. Biological Cybernetics, 1980, 36(4):193-202.
[20] Lecun Y, Boser B, Denker J S, et al. Backpropagation Applied to Handwritten Zip Code Recognition[J]. Neural Computation, 1989, 1(4):541-551.
[21] Yin W, Schütze H, Xiang B, et al. ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs[J]. Computer Science, 2015.
[22] Wang L, Cao Z, Melo G D, et al. Relation Classification via Multi-Level Attention CNNs[C]// Meeting of the Association for Computational Linguistics. 2016:1298-1307.
[23] Zhu J, Qiao J, Dai X, et al. Relation Classification via Target-Concentrated Attention CNNs[J]. 2017:137-146.
[24] Bengio Y, Ducharme R, Vincent P, et al. A neural probabilistic language model.[M] Innovations in Machine Learning. Springer Berlin Heidelberg, 2006:137-186.
[25] Bojanowski P, Grave E, Joulin A, et al. Enriching Word Vectors with Subword Information[J]. 2016.
[26] Pennington J, Socher R, Manning C. Glove: Global Vectors for Word Representation[C]// Conference on Empirical Methods in Natural Language Processing. 2014:1532-1543.
[27] Sutskever I, Vinyals O, Le Q V. Sequence to Sequence Learning with Neural Networks[J]. 2014, 4:3104-3112.
[28] Jaitly N, Sussillo D, Le Q V, et al. A Neural Transducer[J]. Computer Science, 2016.
[29] Vinyals O, Le Q. A Neural Conversational Model[J]. Computer Science, 2015.
[30] Jean S, Cho K, Memisevic R, et al. On Using Very Large Target Vocabulary for Neural Machine Translation[J]. Computer Science, 2015.
[31] Bahdanau D, Cho K, Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate[J]. Computer Science, 2014.
[32] Jaitly N, Sussillo D, Le Q V, et al. A Neural Transducer[J]. Computer Science, 2016.
[33] Britz D, Goldie A, Luong M T, et al. Massive Exploration of Neural Machine Translation Architectures[J]. 2017.
[34] Papineni K, Roukos S, Ward T, et al. BLEU: a Method for Automatic Evaluation of Machine Translation[C]// Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2002.
[35] Gehring J, Auli M, Grangier D, et al. Convolutional Sequence to Sequence Learning[J]. 2017.
[36] Srivastava N, Hinton G, Krizhevsky A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. Journal of Machine Learning Research, 2014, 15(1):1929-1958.
[37] Hinton G E, Srivastava N, Krizhevsky A, et al. Improving neural networks by preventing co-adaptation of feature detectors[J]. Computer Science, 2012, 3(4): 212-223.
[38] Kingma D, Ba J. Adam: A Method for Stochastic Optimization[J]. Computer Science, 2014.
馆藏号:

 017/M2018(311)    

公开日期:

 2021-05-26    

基于多人在线战术竞技游戏的虚拟团队数据分析与研究.曾伊蕾

链接

题名:

 基于多人在线战术竞技游戏的虚拟团队数据分析与研究    

姓名:

 曾伊蕾    

学号:

 1401210506    

论文语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 1年后    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 俞敬松    

导师1单位:

 软件与微电子学院    

论文答辩日期:

 2018-05-26    

关键词:

 计算社会科学 个人表现研究 虚拟团队的网络科学    

论文摘要:

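Computational social science (CSS) is an interdisciplinary field spanning computer technology and the social sciences. This study is a concrete instance of the field's research on individual and group behavior and is supported by funding in this field. The goal of this thesis is to quantify, track, and predict the common and anomalous behavioral trajectories of individual and team performance in gamified virtual team environments, with the aim of helping virtual and real-world teams improve overall performance and supporting personalized incentives on online gamified platforms.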

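The novelty and significance of this work lie in targeted solutions to four kinds of problems in research on individual performance. First, individual behavior is usually modeled in a one-size-fits-all way; this thesis builds personalized models of individuals from multiple angles, including role, experience, skill, and team network structure. Second, individual behavior models usually contain no temporal dynamics; all models in this thesis consider how human behavioral trajectories evolve over time. Third, individual behavior models usually ignore social network effects; Chapter 5 of this thesis focuses on the effects of different network structures. Fourth, individual behavior models usually lack generality, reproducibility, testability, and interpretability; the methods in this thesis are interpretable and reproducible, and the experimental results show that the conclusions generalize across game platforms.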

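The thesis first builds a model of how individual performance evolves over time, analyzing player data from the multiplayer online battle arena (MOBA) game League of Legends (英雄联盟). Through regression analysis of long-term behavior and game-block (游戏块) analysis of short-term behavior, the data reveal conclusions that differ from common intuition: individual performance deteriorates within a short-term game block, and improvement in individual performance has no direct connection with long-term experience, although experience can mitigate an individual's short-term performance deterioration. Using machine learning algorithms, the thesis builds a nested model that accurately predicts when a player will choose to continue or end the current game block, revealing the key factors behind the decision to stay or leave. Building on this temporal model, the thesis then constructs a model of how individual performance evolves with role choice, using player data from the MOBA game Dota 2 (刀塔2). Roles are defined through statistical analysis, and the results show a short-term warm-up effect in individual behavior across roles. This model classifies individuals by experience, skill, role, and so on, and the experimental results reveal patterns of success for individual players. Finally, again building on the temporal model, the thesis models the effect of network structure on individual and team performance, using not only massive MOBA game data but also players' real friendship data. The thesis subdivides team network structures and applies network science, economic principles, and mathematical statistics to analyze individual and team performance as it evolves over time. The results show that in low-ability teams, the players who make up the network structure generate positive externalities, which improve the behavior and performance of both individuals within the team and the team as a whole. High-level teams need to deliberately pair lower-level individuals with higher-level ones, internalizing negative externalities to help improve team and individual performance. The experimental results also show that close ties within a team help mitigate the short-term performance deterioration effect. Although this is a domain-specific study, its theoretical results, dynamic models, and analytical methods can all be applied to more abstract contexts for describing and explaining human behavior.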

分类号:

 TP3    

论文总页数:

 127    

参考文献总数:

 72    

参考文献列表:
[1] Ajzen I. The theory of planned behavior, organizational behavior and human decision processes.[J]. Journal of Leisure Research, 1991, 50(2):176-211.
[2] Hamari J, Koivisto J, Sarsa H. Does Gamification Work? -- A Literature Review of Empirical Studies on Gamification.[C] Hawaii International Conference on System Sciences. IEEE, 2014:3025-3034.
[3] Farzan R, Dimicco J M, Millen D R, et al. Results from deploying a participation incentive mechanism within the enterprise.[C] Sigchi Conference on Human Factors in Computing Systems. ACM, 2008:563-572.
[4] Hey T. The Fourth Paradigm – Data-Intensive Scientific Discovery.[J]. Proceedings of the IEEE, 2011, 99(8):1334-1337.
[5] Lazer D, Pentland A, Adamic L, et al. Life in the network: the coming age of computational social science.[J]. Science, 2016, 323(5915):721-723.
[6] Conte R, Gilbert N, Bonelli G, et al. Manifesto of computational social science.[J]. European Physical Journal Special Topics, 2012, 214(1):325-346.
[7] Lazer D, Pentland A, Adamic L, et al. Social science. Computational social science.[J]. Science, 2009, 323(5915):721-3.
[8] Centola D. The Spread of Behavior in an Online Social Network Experiment.[J]. Science, 2010, 329(5996):1194-1197.
[9] Calvó-Armengol A, Jackson M O. Like Father, Like Son: Social Network Externalities and Parent-Child Correlation in Behavior.[J]. American Economic Journal Microeconomics, 2009, 1(1):124-150.
[10] Lewis K, Gonzalez M, Kaufman J. Social selection and peer influence in an online social network.[J]. Proceedings of the National Academy of Sciences of the United States of America, 2012, 109(1):68-72.
[11] Chudoba K M, Wynn E, Lu M, et al. How virtual are we? Measuring virtuality and understanding its impact in a global organization.[J]. Information Systems Journal, 2005, 15(4):279–306.
[12] Townsend A M, Hendrickson A R. Virtual Teams: Technology and the Workplace of the Future.[J]. Academy of Management Executive, 1998, 12(3):17-29.
[13] Richard H J, Nancy K. Group Behavior and Performance.[M]// Handbook of Social Psychology. 2010:1258-63.
[14] Hertel G, Niedner S, Herrmann S. Motivation of software developers in Open Source projects: an Internet-based survey of contributors to the Linux kernel.[J]. Research Policy, 2003, 32(7):1159-1177.
[15] Clark J, Leavitt A, Williams D. Online Games, Community Aspects of.[M] The International Encyclopedia of Digital Communication and Society. John Wiley & Sons, Inc. 2015.
[16] Huang Y, Ye W, Bennett N, et al. Functional or social?:exploring teams in online games.[C] Conference on Computer Supported Cooperative Work. 2013:399-408.
[17] Ducheneaut N, Moore R J. The social side of gaming: a study of interaction patterns in a massively multiplayer online game.[C] ACM Conference on Computer Supported Cooperative Work. ACM, 2004:360-369.
[18] Shen C. Network patterns and social architecture in Massively Multiplayer Online Games: Mapping the social world of EverQuest II.[J]. New Media & Society, 2014, 16(4):672-691.
[19] Assmann J J, Drescher M A, Gallenkamp J V, et al. MMOGs as Emerging Opportunities for Research on Virtual Organizations and Teams.[C] Americas Conference on Information Systems, Amcis 2010, "sustainable It Collaboration Around the Globe.", Lima, Peru, August. DBLP, 2010:335.
[20] Goh S, Wasko M. The effects of leader-member exchange on member performance in virtual world teams.[J]. Journal of the Association for Information Systems, 2012, 13(10):861-885.
[21] Nardi B, Harris J. Strangers and Friends: Collaborative Play in World of Warcraft.[C] ACM Conference on Computer Supported Cooperative Work, CSCW 2006, Banff, Alberta, Canada, November. DBLP, 2006:149-158.
[22] Kou Y, Gui X. Playing with strangers: understanding temporary teams in league of legends.[C] ACM Sigchi Symposium on Computer-Human Interaction in Play. ACM, 2014:161-169.
[23] Park K, Cha M, Kwak H, et al. Achievement and Friends: Key Factors of Player Retention Vary Across Player Levels in Online Multiplayer Games[C]// International Conference on World Wide Web Companion. International World Wide Web Conferences Steering Committee, 2017:445-453.
[24] Bardzell S, Bardzell J, Pace T, et al. Blissfully productive: grouping and cooperation in world of warcraft instance runs.[C] ACM Conference on Computer Supported Cooperative Work. ACM, 2008:357-360.
[25] Tyack A, Wyeth P, Johnson D. The Appeal of MOBA Games: What Makes People Start, Stay, and Stop[C]// Symposium on Computer-Human Interaction in Play. ACM, 2016:313-325.
[26] Benefield G A, Shen C, Leavitt A. Virtual Team Networks: How Group Social Capital Affects Team Success in a Massively Multiplayer Online Game.[C] ACM Conference on Computer-Supported Cooperative Work & Social Computing. ACM, 2016:679-690.
[27] Kim J, Keegan B C, Park S, et al. The Proficiency-Congruency Dilemma: Virtual Team Design and Performance in Multiplayer Online Games.[J]. Computer Science, 2015:4351-4365.
[28] Leavitt A, Keegan B C, Clark J. Ping to Win?: Non-Verbal Communication and Team Performance in Competitive Online Multiplayer Games.[C] CHI Conference on Human Factors in Computing Systems. ACM, 2016:4337-4350.
[29] Kim Y J, Engel D, Mcarthur N, et al. What Makes a Strong Team?: Using Collective Intelligence to Predict Team Performance in League of Legends.[C] ACM Conference on Computer Supported Cooperative Work and Social Computing. ACM, 2017:2316-2329.
[30] Huang J, Zimmermann T, Nagapan N, et al. Mastering the art of war:how patterns of gameplay influence skill in Halo.[C] Sigchi Conference on Human Factors in Computing Systems. 2013:695-704.
[31] Vicencio-Moreira R, Mandryk R L, Gutwin C. Now You Can Compete With Anyone: Balancing Players of Different Skill Levels in a First-Person Shooter Game.[C] ACM Conference on Human Factors in Computing Systems. ACM, 2015:2255-2264.
[32] Sievertsen H H, Gino F, Piovesan M. Cognitive fatigue influences students’ performance on standardized tests.[J]. Proceedings of the National Academy of Sciences of the United States of America, 2016, 113(10):2621.
[33] Borghini G, Astolfi L, Vecchiato G, et al. Measuring neurophysiological signals in aircraft pilots and car drivers for the assessment of mental workload, fatigue and drowsiness.[J]. Neuroscience & Biobehavioral Reviews, 2014, 44:58-75.
[34] Muraven M, Baumeister R F. Self-regulation and depletion of limited resources: does self-control resemble a muscle?[J]. Psychological Bulletin, 2000, 126(2):247-59.
[35] Kooti F, Moro E, Lerman K. Twitter Session Analytics: Profiling Users’ Short-Term Behavioral Changes.[M] Social Informatics. Springer International Publishing, 2016:71-86.
[36] Singer P, Ferrara E, Kooti F, et al. Evidence of Online Performance Deterioration in User Sessions on Reddit[J]. Plos One, 2016, 11(8):e0161636.
[37] Scerbo M W. Stress, Workload and Boredom in Vigilance: A Problem and an Answer.[J]. Stress Workload & Fatigue, 2001.
[38] Warm J S, Matthews G, Finomore V S Jr. Vigilance, workload, and stress.[J]. Performance under stress, 2008:115-41.
[39] Boksem M A, Tops M. Mental fatigue: costs and benefits.[J]. Brain Research Reviews, 2008, 59(1):125-139.
[40] Marcora S M, Staiano W, Manning V. Mental fatigue impairs physical performance in humans.[J]. Journal of Applied Physiology, 2009, 106(3):857-64.
[41] Lim J, Wu W C, Wang J, et al. Imaging brain fatigue from sustained mental workload: an ASL perfusion study of the time-on-task effect[J]. Neuroimage, 2010, 49(4):3426-3435.
[42] Pattyn N, Neyt X, Henderickx D, et al. Psychophysiological investigation of vigilance decrement: boredom or cognitive fatigue?[J]. Physiology & Behavior, 2008, 93(1-2):369.
[43] Lorist M M, Boksem M A S, Ridderinkhof K R. Impaired cognitive control and reduced cingulate activity during mental fatigue.[J]. Brain Research Cognitive Brain Research, 2005, 24(2):199.
[44] Boksem M A, Meijman T F, Lorist M M. Effects of mental fatigue on attention: an ERP study[J]. Brain Res Cogn Brain Res, 2005, 25(1):107-116.
[45] Boksem M A, Meijman T F, Lorist M M. Mental fatigue, motivation and action monitoring.[J]. Biological Psychology, 2006, 72(2):123-132.
[46] Demerouti E, Bakker A B, Nachreiner F, et al. The job demands-resources model of burnout.[J]. J Appl Psychol, 2001, 86(3):499-512.
[47] G. Robert J. Hockey, A. John Maule, Peter J. Clough, et al. Effects of negative mood states on risk in everyday decision making.[J]. Cognition & Emotion, 2000, 14(6):823-855.
[48] Sanders A F. Elements of human performance:, Reaction processes and attention in human skill.[M] Elements of Human Performance: Reaction Processes and Attention in Human Skill. Lawrence Erlbaum Associates, 1998:231-234.
[49] Van d L D, Frese M, Meijman T F. Mental fatigue and the control of cognitive processes: effects on perseveration and planning.[J]. Acta Psychologica, 2003, 113(1):45.
[50] Danziger S, Levav J, Avnaimpesso L. Extraneous factors in judicial decisions.[J]. Proceedings of the National Academy of Sciences of the United States of America, 2011, 108(17):6889.
[51] Vohs K D, Baumeister R F, Schmeichel B J, et al. Making choices impairs subsequent self-control: a limited-resource account of decision making, self-regulation, and active initiative.[J]. Journal of Personality & Social Psychology, 2008, 94(5):883-98.
[52] Mullettegillman O A, Leong R L, Kurnianingsih Y A. Cognitive Fatigue Destabilizes Economic Decision Making Preferences and Strategies.[J]. 2015, 10(7).
[53] Page S E. The Difference:How the Power of Diversity Creates Better Groups, Firms, Schools, and Societies (New Edition).[M]. Princeton University Press, 2008.
[54] Jia P, Mirtabatabaei A, Friedkin N E, et al. Opinion Dynamics and the Evolution of Social Power in Influence Networks.[J]. Siam Review, 2013, 57(3):367-397.
[55] Woolley A W, Chabris C F, Pentland A, et al. Evidence for a collective intelligence factor in the performance of human groups.[J]. Science, 2010, 330(6004):686-688.
[56] Ferrara E, Alipourfard N, Burghardt K, et al. Dynamics of Content Quality in Collaborative Knowledge Production.[J]. 2017.
[57] Halfaker A, Keyes O, Kluver D, et al. User Session Identification Based on Strong Regularities in Inter-activity Time.[C] International World Wide Web Conferences Steering Committee, 2015:410-418.
[58] Ho T K. The Random Subspace Method for Constructing Decision Forests[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 1998, 20(8):832-844.
[59] Friedman J, Hastie T, Tibshirani R. The Elements of Statistical Learning.[J]. Journal of the Royal Statistical Society, 2001, 167(1):267-268.
[60] Friedman J H. Greedy Function Approximation: A Gradient Boosting Machine.[J]. Annals of Statistics, 2001, 29(5):1189-1232.
[61] Freund Y, Schapire R, Abe N. A short introduction to boosting.[J]. Journal-Japanese Society For Artificial Intelligence, 1999, 14:771-780.
[62] Schapire R E, Singer Y. Improved Boosting Algorithms Using Confidence-rated Predictions.[J]. Machine Learning, 1999, 37(3):297-336.
[63] Radicchi F, Fortunato S, Markines B, et al. Diffusion of scientific credits and the ranking of scientists.[J]. Physical Review E Statistical Nonlinear & Soft Matter Physics, 2009, 80(2):056103.
[64] Sinatra R, Wang D, Deville P, et al. Quantifying the evolution of individual scientific impact.[J]. Science, 2016, 354(6312):aaf5239-aaf5239.
[65] Rodi G C, Loreto V, Servedio V D P, et al. Optimal Learning Paths in Information Networks.[J]. Scientific Reports, 2015, 5:10286.
[66] Memmert D, Lemmink K A, Sampaio J. Current Approaches to Tactical Performance Analyses in Soccer Using Position Data.[J]. Sports Medicine, 2016:1-10.
[67] Cha M, Haddadi H, Benevenuto F, et al. Measuring User Influence in Twitter: The Million Follower Fallacy.[C] International Conference on Weblogs and Social Media, Icwsm 2010, Washington, Dc, Usa, May. DBLP, 2010.
[68] Hong L, Dan O, Davison B D. Predicting popular messages in Twitter.[C] International Conference on World Wide Web, WWW 2011, Hyderabad, India, March 28 - April. DBLP, 2011:57-58.
[69] Movshovitz-Attias D, Movshovitz-Attias Y, Steenkiste P, et al. Analysis of the reputation system and user contributions on a question answering website: StackOverflow.[C] Ieee/acm International Conference on Advances in Social Networks Analysis and Mining. ACM, 2013:886-893.
[70] Pobiedina N, Neidhardt J, Moreno M D C C, et al. On Successful Team Formation: Statistical Analysis of a Multiplayer Online Game.[C] Business Informatics. IEEE, 2013:55-62.
[71] Becker R, Chernihov Y, Shavitt Y, et al. An analysis of the Steam community network evolution.[C]// Electrical & Electronics Engineers in Israel. IEEE, 2012:1-5.
[72] Blackburn J, Kourtellis N, Skvoretz J, et al. Cheating in Online Games: A Social Network Perspective.[J]. Acm Transactions on Internet Technology, 2014, 13(3):9.
Call number:

 017/M2018(336)

Release date:

 2019-05-26

基于神经网络的影视剧向量表示模型.隋春宁

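Link

Title:

 基于神经网络的影视剧向量表示模型

Author:

 隋春宁

Student ID:

 1501210674

Language:

 chi

Program:

 Professional degree - Engineering - Computer Technology

Public availability:

 After 3 years

Level of study:

 Master's

Degree:

 Master of Engineering (professional degree)

Degree-granting institution:

 Peking University

School:

 School of Software and Microelectronics

Advisor name:

 俞敬松

Advisor affiliation:

 School of Software and Microelectronics

Defense date:

 2018-05-26

Title (foreign language):

 A Video Content Embedding Model Using Neural Networks

Keywords (Chinese):

 Distributed representation; neural networks; video content representation

Keywords (foreign language):

 Distributed representation; Neural networks; Video content embedding

Abstract: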

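As video websites continue to grow, both the volume of film and television data and the number of users have risen sharply, creating strong demand for tasks such as automatic classification and recommendation of films and television series. Traditionally, classification information on video sites comes from human editors, while recommender systems rely mainly on user behavior data and collaborative filtering. Because labeling effort is limited and data are sparse, the scalability of manual classification is a major bottleneck, and recommendations for niche titles or new users are also limited.

This thesis uses neural networks to reduce the dimensionality of, and integrate, heterogeneous text data from different sources, such as the tags and plot synopses of films and television series, mapping the raw text into a semantic space to obtain content-based low-dimensional vector representations. Such distributed vector representation models are called embedding models in deep learning; in recent years they have attracted wide attention in natural language processing and have achieved breakthroughs on many tasks.

The thesis first studies how to model text data at different granularities, surveys the concepts and methods of distributed semantic representation models at the word, phrase, sentence, and paragraph levels, and discusses how to apply them to films and television series. It then builds a neural-network-based vector representation model of film and television content, using an improved negative sampling training strategy to fuse text metadata of different granularities and sources into vectors in a single consistent semantic space. The study shows that a neural distributed vector representation model can effectively model the content of existing films and television series and can be applied to newly added titles. The model can be used for tasks such as automatic recommendation and clustering.

Abstract (foreign language):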

With the continuous advancement of online video providers, the number of movies and television series online has risen significantly, along with the amount of user data. A great demand has arisen for tasks such as the automatic classification and recommendation of such video content. Traditionally, the classification information of video sites often comes from manual editors, while recommendation systems mainly rely on user behavioral data and collaborative filtering algorithms. Due to the limited man-hours available for labeling and the problem of data sparseness, the scalability of manual classification is a major bottleneck; the recommended results for unpopular movies or new users are also limited.

In this dissertation, neural networks are employed for the dimensionality reduction and integration of heterogeneous text data from different sources, such as the labels and synopses of movies and television series. By mapping raw texts to the semantic space, we obtain low-dimensional vector representations based on their contents. This distributed vector representation model is called an embedding model in deep learning. In recent years, it has received extensive attention and research in the field of natural language processing, and has made breakthroughs in many tasks.

Firstly, this dissertation studies how to model text data with different granularities, reviews the concepts and methods of distributed semantic representation models of words, phrases, sentences and paragraphs, and discusses how to apply these models in the context of movies and television series. Secondly, a vector representation model of video content is established using neural networks. Text metadata of different granularities and from different sources are merged and mapped into a vector representation in a consistent semantic space via an improved negative sampling training strategy. The study shows that the distributed vector representation model of neural networks can effectively model the content of existing movies and television series, and can be easily applied to newly added content data. The model can be applied to automatic recommendation, clustering and other tasks.
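For readers unfamiliar with negative sampling, the following Python sketch shows the standard objective from the word2vec line of work (Mikolov et al. 2013c in the reference list below). It is not the thesis's "improved" strategy, whose details the abstract does not give. The function name `negative_sampling_loss` and the toy vectors are assumptions for illustration. One vector (for example a title's embedding) is pulled toward a piece of metadata it actually has and pushed away from randomly sampled metadata it does not have.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def negative_sampling_loss(item_vec, positive_vec, negative_vecs):
    """Standard negative sampling loss for one (item, positive) pair.

    loss = -log sigmoid(item . positive) - sum_k log sigmoid(-(item . negative_k))
    A lower value means the item sits closer to its true context and
    further from the sampled negatives.
    """
    loss = -math.log(sigmoid(dot(item_vec, positive_vec)))
    for negative_vec in negative_vecs:
        loss -= math.log(sigmoid(-dot(item_vec, negative_vec)))
    return loss


# Toy two-dimensional vectors (made-up values).
item = [1.0, 0.0]
true_tag = [2.0, 0.0]
sampled_tags = [[-2.0, 0.0], [0.0, 1.0]]

well_placed = negative_sampling_loss(item, true_tag, sampled_tags)
# Swapping the roles of the true tag and one negative should score worse.
badly_placed = negative_sampling_loss(item, sampled_tags[0], [true_tag, sampled_tags[1]])
```

In training, the gradient of this loss with respect to each vector is what moves tags, synopsis text, and titles into one shared space; only a handful of negatives are sampled per pair, which keeps each update cheap compared with a full softmax over all metadata items.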

Classification number:

 TP3

Total pages:

 61

Number of references:

 60

References:
Andrew G, Arora R, Bilmes J A, et al. 2013. Deep canonical correlation analysis. ICML, 1247-1255.
Bach F R, Jordan M I. 2002. Kernel independent component analysis. Journal of Machine Learning Research, Issue 3, 1-48.
Baroni M, Dinu G, Kruszewski G. 2016. Don't count, predict! a systematic comparison of context-counting vs. context-predicting semantic vectors. Proc. ACL, Volume 1, 238–247.
Bengio Y, Ducharme R, Vincent P, et al. 2003. A neural probabilistic language model. Journal of Machine Learning Research, Issue 3, 1137-1155.
Blei D M, Ng A Y, Jordan M I. 2003. Latent Dirichlet Allocation. Journal of Machine Learning Research, Issue 3, 993–1022.
Bojanowski P, Grave E, Joulin A, Mikolov T. 2017. Enriching word vectors with subword information. arXiv: 1607.04606.
Boureau Y-L, Ponce J, LeCun Y. 2010. A theoretical analysis of feature pooling in visual recognition. ICML, 111-118.
Bullinaria J A, Levy J P. 2012. Extracting semantic representations from word co-occurrence statistics: stop-lists, stemming, and SVD. Behavior research methods, 44(3), 890-907.
Chandar S, Lauly S, Larochelle H, et al. 2014. An autoencoder approach to learning bilingual word representations. Proceedings of NIPS 2014.
Cho K, van Merrienboer B, Gulcehre C, et al. 2014. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv:1406.1078.
Chung J, Gulcehre C, Cho K, et al. 2014. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv: 1412.3555.
Cohn D A, Hofmann T. 2000. The missing link -- A probabilistic model of document content and hypertext connectivity. NIPS, 430-436.
Collobert R, Weston J. 2008. A unified architecture for natural language processing: deep neural networks with multitask learning. ICML, 160–167.
Collobert R, Weston J, Bottou L, et al. 2011. Natural language processing (almost) from scratch. Journal of Machine Learning Research, Issue 12, 2493--2537.
Conneau A, Lample G, Ranzato M, et al. 2017. Word translation without parallel data. arXiv: 1710.04087.
Conneau A, Schwenk H, Barrault L, et al. 2017. Very Deep Convolutional Networks for Text Classification. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, Volume 1, 1107-1116.
Cybenko G. 1989. Approximations by superpositions of sigmoidal functions. Mathematics of Control, Signals, and Systems, 2(4), 303-314.
Deerwester S, Dumais S T, Furnas G W, et al. 1990. Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391–407, 477, 482.
Faruqui M, Dyer C. 2014. Improving vector space word representations using multilingual correlation. Proceedings of EACL 2014.
Fawcett T. 2006. An introduction to roc analysis. Pattern recognition letters, 27(8), 861-874.
Feng F, Wang X, Li R. 2014. Cross-modal retrieval with correspondence autoencoder. ACM Multimedia 2014, 7-16.
Firth J R. 1957. A synopsis of linguistic theory. s.l.:s.n.
Golub G H, Reinsch C. 1970. Singular value decomposition and least squares solutions. Numerische mathematik, 14(5), 403-420.
Goodfellow I, Bengio Y, Courville A. 2016. Deep Learning. s.l.:MIT Press.
Gouws S, Bengio Y, Corrado G. 2015. BilBOWA: Fast Bilingual Distributed Representations without Word Alignments. arXiv: 1410.2455.
Hermann K M, Blunsom P. 2013. Multilingual distributed representations without word alignment. arXiv:1312.6173.
Hinton G E, Srivastava N, Krizhevsky A, et al. 2012. Improving neural networks by preventing co-adaptation of feature detectors. arXiv:1207.0580.
Hochreiter S, Schmidhuber J. 1997. Long short-term memory. Neural computation, 9(8), 1735-1780.
Hofmann T. 1999. Probabilistic Latent Semantic Indexing. Proceedings of the Twenty-Second Annual International SIGIR Conference on Research and Development in Information Retrieval, 289-296.
Hotelling H. 1936. Relations between two sets of variates. Biometrika, 28(3/4), 321-377.
Insall M, Rowland T, Weisstein E W. 2018. Embedding. [Online] Available at: http://mathworld.wolfram.com/Embedding.html [Accessed 2018-04-01].
Kalchbrenner N, Grefenstette E, Blunsom P. 2014. A convolutional neural network for modeling sentences. arXiv: 1606.04640.
Karpathy A. 2018. CS231n: Convolutional Neural Networks for Visual Recognition. [Online] Available at: http://cs231n.github.io [Accessed 2018-04-01].
Karpathy A, Fei-Fei L. 2014. Deep visual-semantic alignments for generating image descriptions. CoRR 2014.
Kim Y. 2014. Convolutional neural networks for sentence classification. EMNLP 2014, 1746–1751.
Kingma D P, Ba J L. 2015. Adam: A method for stochastic optimization. ICLR 2015.
Kiros R, Zhu Y, Salakhutdinov R, et al. 2015. Skip-Thought Vectors. arXiv: 1506.06726.
Lebret R, Collobert R. 2013. Word emdeddings through hellinger PCA. arXiv: 1312.5542.
LeCun Y, Bottou L, Bengio Y, et al. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE, Volume 86, 2278-2324.
Le Q V, Mikolov T. 2014. Distributed representations of sentences and documents. ICML, 1188-1196.
Levy O, Goldberg Y. 2014. Neural word embedding as implicit matrix factorization. Proceedings of NIPS 2014, 2177-2185.
Levy O, Goldberg Y. 2015. Improving distributional similarity with lessons learned from word embeddings. TACL, Issue 3, 211-225.
Li Y, Yang M, Zhang Z. 2015. Multi-View Representation Learning: A Survey from Shallow Methods to Deep Methods. arXiv: 1610.01206.
Maas A L, Hannun A Y, Ng A Y. 2013. Rectifier Nonlinearities Improve Neural Network Acoustic Models. Proceedings of the 30th International Conference on Machine Learning, JMLR: 28.
Mikolov T, Chen K, Corrado G, et al. 2013a. Efficient estimation of word representations in vector space. arXiv: 1301.3781.
Mikolov T, Le Q V, Sutskever I. 2013b. Exploiting similarities among languages for machine translation. International Conference on Learning Representations.
Mikolov T, Sutskever I, Chen K, et al. 2013c. Distributed representations of words and phrases and their compositionality. Proceedings of NIPS 2013, 3111-3119.
Mitra B, Craswell N. 2017. Neural Models for Information Retrieval. arXiv: 1705.01509.
Nair V, Hinton G E. 2010. Rectified linear units improve restricted boltzmann machines. s.l., s.n.
Pennington J, Socher R, Manning C D. 2014. Glove: Global vectors for word representation. Proceedings of EMNLP 2014, 1532-1543.
Rehurek R. 2011. Fast and Faster: A comparison of two streamed matrix decomposition methods. arXiv: 1102.5597.
Rumelhart D E, Hinton G E, Williams R J. 1986. Learning representations by back-propagating errors. Nature, Issue 323, 533–536.
Schmidhuber J. 2014. Deep Learning in Neural Networks: An Overview. Technical Report IDSIA-03-14. arXiv:1404.7828.
Smith S L, Turban D H P, Hamblin S, et al. 2017. Offline bilingual word vectors, orthogonal transformations and the inverted softmax. arXiv: 1702.03859.
Socher R, Perelygin A, Wu J Y, et al. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. Proceedings of EMNLP, 1631-1642.
Weston J, Chopra S, Adams K. 2014. #TagSpace: semantic embeddings from hashtags. Proceedings of EMNLP, 1822-1827.
Zhang X, Zhao J, LeCun Y. 2015. Character-level convolutional networks for text classification. NIPS, 649–657.
Zhao Z, Liu T, Li S, et al. 2017. Ngram2vec – learning improved word representations from ngram co-occurrence statistics. Proceedings of EMNLP 2017, 244–253.
van der Maaten L, Hinton G. 2008. Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research, Issue 9, 2579-2605.

Call number:

 017/M2018(368)

Release date:

 2021-05-26

面向移动端的用户检索实体抽取系统设计与实现.曹圣明

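Link

Title:

 面向移动端的用户检索实体抽取系统设计与实现

Author:

 曹圣明

Student ID:

 1501210487

Language:

 chi

Program:

 Professional degree - Engineering - Computer Technology

Public availability:

 After 3 years

Level of study:

 Master's

Degree:

 Master of Engineering (professional degree)

Degree-granting institution:

 Peking University

School:

 School of Software and Microelectronics

Advisor name:

 俞敬松

Advisor affiliation:

 School of Software and Microelectronics

Defense date:

 2018-05-26

Title (foreign language):

 Design and Implementation of Entity Extraction System in User Query For Mobile Terminal Devices

Keywords (Chinese):

 Deep learning; entity extraction; intelligent semantic interaction; attention mechanism; word embedding; hardware adaptation; model compression

Keywords (foreign language):

 deep learning; entity extraction; intelligent semantic interaction; attention mechanism; word embedding; hardware adaptation; model compression

Abstract: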

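Entity extraction, a basic task in natural language processing, has made a series of breakthroughs with the rise of deep learning. As a foundational component of tasks such as question answering, human-machine dialogue, and machine translation, its role is irreplaceable. Recently, with the rise of artificial intelligence and growing demand for intelligent semantic interaction, entity extraction from user queries has become an important capability. Compared with traditional named entity recognition, it covers a broader range of domains, has stricter precision and accuracy requirements, and involves more complex user interaction logic. With the entity recognition results, we can issue resource requests and dispatch services to fulfil users' needs and guide their latent needs, which is a very important part of new forms of text interaction.

With this goal, the thesis implements two systems, one online and one offline, whose core is the entity extraction function, supplemented by a necessary pattern matching module to serve users' most frequent requests and to correct the model's recognition defects. For entity extraction, we mainly use the TensorFlow framework to train, tune, and deploy the model. For the baseline, the thesis innovatively adopts a seq2seq structure to implement the basic framework for named entity recognition. The baseline is then tuned with respect to training data size, input granularity, normalization, and the attention mechanism. Finally, the model structure is improved and optimized in three respects: the word embedding generation method, the attention mechanism, and new types of models. Together these raise the model's performance by more than 10 points. During algorithm iteration, we obtained the best results by combining models and enhancing the word embeddings. Lastly, we tested the model on Microsoft's public named entity recognition test set and achieved fairly good results. Experiments with CNN encoders, a deeper investigation of attention mechanisms, and a survey of entity disambiguation models are left as future research directions.

For model deployment on mobile devices, the thesis also carries out in-depth optimization on both the hardware and software sides. On the software side, we perform model compression and data structure optimization; on the hardware side, dependency separation and hardware adaptation. Overall, this largely resolves the high memory usage and low execution efficiency of deep learning models deployed on mobile devices, and many of the solutions are worth borrowing.

Abstract (foreign language):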

As a basic task of natural language processing, entity extraction has achieved breakthroughs with the rise of deep learning. Named entity extraction plays an irreplaceable role in QA systems, interactive chat, machine translation, and similar tasks. Recently, with the growing demand for intelligent semantic interaction and the boom in AI, entity extraction has emerged as a focal point in user query processing. Compared with traditional named entity recognition, it spans a broader range of domains, imposes stricter limits on precision and recall, and involves more sophisticated interactive routines. Based on the extraction results, we can complete a series of resource requests and service dispatches, not only to meet users' demands but also to stimulate their potential needs.

We have therefore implemented two systems, an online and an offline version, each composed of an entity extraction part and a related pattern matching module. The latter module is necessary to handle hot queries and to compensate for the model's shortcomings. In this paper, we mainly use TensorFlow for model training, fine-tuning, and deployment. For the baseline of our experiments, we used a seq2seq architecture instead of the encoder-only method and achieved better results. We then tuned the baseline in terms of data scale, input granularity, regularization, and attention mechanism. Finally, we changed the model architecture through embedding adjustment, attention mechanisms, and novel methods to yield better performance. In total, the new results outperform the old ones by over 10 percentage points. We also tested our model on the open MSRA NER dataset and achieved state-of-the-art performance. Furthermore, we will continue our research on CNN encoders, attention mechanisms, and entity disambiguation.

As for deployment on mobile devices, we have made optimizations and improvements on both the software and hardware sides. More specifically, the former mainly consists of model compression and data structure optimization. For the latter, we mainly used dependency release and instruction-level adaptation. Overall, we have solved the problems of high memory occupation and low inference speed that may be encountered when deploying deep learning models on mobile devices.
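As a small illustration of what entity extraction from a user query produces, the Python sketch below turns a per-token tag sequence into entity spans. It assumes the common BIO tagging scheme (`B-` begins an entity, `I-` continues it, `O` is outside); the abstract does not state which scheme the thesis uses, and the function name `decode_bio`, the example query, and the `SONG` and `ARTIST` labels are hypothetical.

```python
def decode_bio(tokens, tags, sep=" "):
    """Collect (text, type) entity spans from BIO tags.

    An I- tag that does not continue an open entity of the same type
    is treated as outside, which is one of several possible policies.
    Use sep="" for character-level input such as Chinese text.
    """
    entities = []
    current_tokens, current_type = [], None

    def close():
        if current_tokens:
            entities.append((sep.join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            close()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            current_tokens.append(token)
        else:
            close()
            current_tokens, current_type = [], None
    close()
    return entities


# Hypothetical query and model output.
query = ["play", "Hey", "Jude", "by", "The", "Beatles"]
predicted = ["O", "B-SONG", "I-SONG", "O", "B-ARTIST", "I-ARTIST"]
spans = decode_bio(query, predicted)
```

The resulting spans are the kind of structured output that downstream resource requests and service dispatch could consume; a pattern matching module like the one the abstract mentions could add or override spans for frequent queries before that step.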

Classification number:

 TP3

Total pages:

 124

Number of references:

 72

References:
[1] Maha Althobaiti, Udo Kruschwitz and Massimo Poesio. Combining Minimally-supervised Methods
for Arabic Named Entity Recognition. 2015,3.

[2] Jimmy Lei Ba, Jamie Ryan Kiros and Geoffrey E Hinton. Layer Normalization. 2016.

[3] Dzmitry Bahdanau, Kyunghyun Cho and Yoshua Bengio. Neural Machine Translation by Jointly
Learning to Align and Translate. CoRR, 2014, abs/1409.0473. http://arxiv.org/abs/1409.0473.

[4] Yoshua Bengio, Réjean Ducharme, Pascal Vincent et al. A Neural Probabilistic Language Model.
Journal of Machine Learning Research, 2004-02-05: 1137–1155. http://dblp.uni-trier.de/db/journals/jmlr/jmlr3.html#BengioDVJ03.

[5] Daniel M. Bikel, Richard Schwartz and Ralph M. Weischedel. An Algorithm that Learns What’s in
a Name. Machine Learning, 1999, 34(1-3): 211–231.

[6] Andrew Borthwick, John Sterling, Eugene Agichtein et al. Description of the MENE named entity
system as used in MUC-7. 1998.

[7] Andrew Borthwick, John Sterling, Eugene Agichtein et al. Exploiting Diverse Knowledge Sources
via Maximum Entropy in Named Entity Recognition. In: 1998: 152–160.

[8] Randall L Calvert. Robustness of the Multidimensional Voting Model: Candidate Motivations, Uncertainty, and Convergence. American Journal of Political Science, 1985, 29(1): 69.

[9] Aitao Chen, Fuchun Peng, Roy Shan et al. Chinese named entity recognition with conditional
probabilistic models. 2006.

[10] Jason P. C. Chiu and Eric Nichols. Named Entity Recognition with Bidirectional LSTM-CNNs.
Computer Science, 2015.

[11] Key Sun Choi, Key Sun Choi and Key Sun Choi. Unsupervised named entity classification models
and their ensembles. In: International Conference on Computational Linguistics, 2002: 1–7.

[12] Michael Collins. Unsupervised Models for Named Entity Classification. In: Joint Sigdat Conference
on Empirical Methods in Natural Language Processing and Very Large Corpora, 1999: 100–110.

[13] Ronan Collobert, Jason Weston, Michael Karlen et al. Natural Language Processing (Almost) from
Scratch. Journal of Machine Learning Research, 2011, 12(1): 2493–2537.

[14] Tim Cooijmans, Nicolas Ballas, César Laurent et al. Recurrent Batch Normalization. CoRR, 2016,
abs/1603.09025. http://arxiv.org/abs/1603.09025.

[15] Chuanhai Dong, Jiajun Zhang, Chengqing Zong et al. Character-Based LSTM-CRF with Radical-Level Features for Chinese Named Entity Recognition. Springer International Publishing, 2016.

[16] Radu Florian. Named entity recognition as a house of cards: classifier stacking. In: Conference on
Natural Language Learning, 2002: 1–4.

[17] Jonas Gehring, Michael Auli, David Grangier et al. Convolutional Sequence to Sequence Learning.
2017.

[18] Franck Genet and Franck Genet. Tagging unknown proper names using decision trees. In: Meeting
on Association for Computational Linguistics, 2000: 77–84.

[19] Yoav Goldberg. The unreasonable effectiveness of Character-level Language Models.

[20] Michael Gutmann and Aapo Hyvärinen. Noise-contrastive estimation: A new estimation principle
for unnormalized statistical models. Journal of Machine Learning Research, 2010, 9: 297–304.

[21] Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka et al. A Joint Many-Task Model: Growing
a Neural Network for Multiple NLP Tasks. 2016.

[22] Kaiming He, Xiangyu Zhang, Shaoqing Ren et al. Deep Residual Learning for Image Recognition.
CoRR, 2015, abs/1512.03385. http://arxiv.org/abs/1512.03385.

[23] Geoffrey E. Hinton, Alex Krizhevsky and Sida D. Wang. Transforming Auto-Encoders. 2011, 6791:
44–51.

[24] Geoffrey E Hinton, Sara Sabour and Nicholas Frosst. Matrix capsules with EM routing. In: International Conference on Learning Representations, 2018. https://openreview.net/forum?id=HJWLfGWRb.

[25] Sergey Ioffe and Christian Szegedy. Batch Normalization: Accelerating Deep Network Training by
Reducing Internal Covariate Shift. 2015: 448–456.

[26] Sergey Ioffe and Christian Szegedy. Batch Normalization: Accelerating Deep Network Training by
Reducing Internal Covariate Shift. CoRR, 2015, abs/1502.03167. http://arxiv.org/abs/1502.03167.

[27] Armand Joulin, Edouard Grave, Piotr Bojanowski et al. Bag of Tricks for Efficient Text Classification.
2016: 427–431.

[28] Yoon Kim, Yacine Jernite, David Sontag et al. Character-Aware Neural Language Models. Computer
Science, 2015.

[29] Trausti Kristjansson, Aron Culotta, Paul Viola et al. Interactive Information Extraction with Constrained Conditional Random Fields. In: Nineteenth National Conference on Artificial Intelligence,
Sixteenth Conference on Innovative Applications of Artificial Intelligence, July 25-29, 2004, San
Jose, California, Usa, 2004: 412–418.

[30] Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian et al. Neural Architectures for Named
Entity Recognition. CoRR, 2016, abs/1603.01360. http://arxiv.org/abs/1603.01360.

[31] César Laurent, Gabriel Pereyra, Philémon Brakel et al. Batch Normalized Recurrent Neural Networks.
2015: 2657–2661.

[32] Nicholas Leonard. Language modeling a billion words.

[33] Dongyun Liang, Weiran Xu, Yinge Zhao et al. Combining Word-Level and Character-Level Representations for Relation Classification of Informal Text. In: The Workshop on Representation Learning
for Nlp, 2017: 43–47.

[34] Xuezhe Ma and Eduard Hovy. End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF.
2016.

[35] Andrei Mikheev, Claire Grover and Marc Moens. Description Of The Ltg System Used For Muc-7.
In: 1998.

[36] Tomas Mikolov, Kai Chen, Greg Corrado et al. Efficient Estimation of Word Representations in
Vector Space. Computer Science, 2013.

[37] Andriy Mnih and Yee Whye Teh. A fast and simple algorithm for training neural probabilistic
language models. 2012: 419–426.

[38] Richard Morgan, Roberto Garigliano, Paul Callaghan et al. University of Durham: description of the
LOLITA system as used in MUC-6. In: Conference on Message Understanding, 1995: 71–85.

[39] Aäron van den Oord, Sander Dieleman, Heiga Zen et al. WaveNet: A Generative Model for Raw
Audio. CoRR, 2016, abs/1609.03499. http://arxiv.org/abs/1609.03499.

[40] Jeffrey Pennington, Richard Socher and Christopher Manning. Glove: Global Vectors for Word
Representation. In: Conference on Empirical Methods in Natural Language Processing, 2014: 1532–1543.

[41] Tran Quan, Andrew Mackinlay and Antonio Jimeno Yepes. Named Entity Recognition with stack
residual LSTM and trainable bias decoding. 2017.

[42] Lisa F Rau. Extracting company names from text. In: Artificial Intelligence Applications, 1991.
Proceedings., Seventh IEEE Conference on, 1991: 29–32.

[43] Lisa F Rau. Method for extracting company names from text. US, 1994.

[44] Nils Reimers and Iryna Gurevych. Reporting Score Distributions Makes a Difference: Performance
Study of LSTM-networks for Sequence Tagging. In: Proceedings of the 2017 Conference on Empirical
Methods in Natural Language Processing (EMNLP). Copenhagen, Denmark, 2017-09: 338–348.
http://aclweb.org/anthology/D17-1035.

[45] Sara Sabour, Nicholas Frosst and Geoffrey E Hinton. Dynamic Routing Between Capsules. 2017.

[46] Samuel L. Smith, Pieter-Jan Kindermans and Quoc V. Le. Don’t Decay the Learning Rate, Increase
the Batch Size. CoRR, 2017, abs/1711.00489. http://arxiv.org/abs/1711.00489.

[47] Rohini Srihari, Niu Cheng and Li Wei. A Hybrid Approach for Named Entity and Sub-Type Tagging.
In: Applied Natural Language Processing Conference, 2000: 247–254.

[48] Rupesh Kumar Srivastava, Klaus Greff and Jürgen Schmidhuber. Training very deep networks.
Computer Science, 2015.

[49] Emma Strubell, Patrick Verga, David Belanger et al. Fast and Accurate Sequence Labeling with
Iterated Dilated Convolutions. CoRR, 2017, abs/1702.02098. http://arxiv.org/abs/1702.02098.

[50] Chen Sun, Abhinav Shrivastava, Saurabh Singh et al. Revisiting Unreasonable Effectiveness of Data
in Deep Learning Era. CoRR, 2017, abs/1707.02968. http://arxiv.org/abs/1707.02968.

[51] Ashish Vaswani, Noam Shazeer, Niki Parmar et al. Attention Is All You Need. 2017.

[52] A Waibel, T Hanazawa, G Hinton et al. Phoneme recognition using time-delay neural networks. IEEE
Press, 1990: 328–339.

[53] Haochang Wang, Tiejun Zhao and Jianmiao Liu. Multi-Agent Classifiers Fusion Strategy for Biomedical Named Entity Recognition, 2008: 311–315.

[54] Dekai Wu, Grace Ngai and Marine Carpuat. A Stacked, Voted, Stacked Model for Named Entity
Recognition. In: Conference on Natural Language Learning at Hlt-Naacl, 2003: 200–203.

[55] Zichao Yang, Diyi Yang, Chris Dyer et al. Hierarchical Attention Networks for Document Classification. In: Conference of the North American Chapter of the Association for Computational Linguistics:
Human Language Technologies, 2017: 1480–1489.

[56] Fisher Yu and Vladlen Koltun. Multi-Scale Context Aggregation by Dilated Convolutions. CoRR,
2015, abs/1511.07122. http://arxiv.org/abs/1511.07122.

[57] Suxiang Zhang, Juan Wen and Xiaojie Wang. Word Segmentation and Named Entity Recognition
for SIGHAN Bakeoff3. 2006.

[58] Xiang Zhang and Yann LeCun. Text Understanding from Scratch. CoRR, 2015, abs/1502.01710.
http://arxiv.org/abs/1502.01710.

[59] Xiang Zhang, Junbo Zhao and Yann Lecun. Character-level Convolutional Networks for Text Classification. 2015: 649–657.

[60] Yimin Zhang and Joe F. Zhou. A trainable method for extracting Chinese entity names and their
relations. In: The Workshop on Chinese Language Processing: Held in Conjunction with the Meeting
of the Association for Computational Linguistics, 2000: 66–72.

[61] Junsheng Zhou, Liang He, Xinyu Dai et al. Chinese Named Entity Recognition with a Multi-Phase
Model. 2012.

[62] Junsheng Zhou, Weiguang Qu et al. Chinese Named Entity Recognition via Joint Identification and
Categorization. Chinese Journal of Electronics, 2013.

[63] 冯元勇, 孙乐, 张大鲲 et al. 基于小规模尾字特征的中文命名实体识别研究. 电子学报, 2008,
36(9): 1833–1838.

[64] 黄德根, 马玉霞 and 杨元生. 基于互信息的中文姓名识别方法. 大连理工大学学报, 2004, 44(5):
744–748.

[65] 季姮 and 罗振声. 基于反比概率模型和规则的中文姓名自动辨识系统. In: 全国计算语言学联
合学术会议, 2001.

[66] 季姮 and 罗振声. 基于统计和规则的中文姓名自动辨识. 语言文字应用, 2001, (1): 14–18.

[67] 孙茂松, 黄昌宁, 高海燕 et al. 中文姓名的自动辨识. 中文信息学报, 1995, 9(2): 16–27.

[68] 孙茂松 and 邹嘉彦. 汉语自动分词研究评述. 当代语言学, 2001, 3(1): 22–32.

[69] 向晓雯, 史晓东 and 曾华琳. 一个统计与规则相结合的中文命名实体识别系统. 计算机应用,
2005, 25(10): 2404–2406.

[70] 张小衡 and 王玲玲. 中文机构名称的识别与分析. 中文信息学报, 1997, 11(4): 21–32.

[71] 郑家恒, 李鑫 and 谭红叶. 基于语料库的中文姓名识别方法研究. 中文信息学报, 2000, 14(1):
7–12.

[72] 周俊生, 戴新宇, 尹存燕 et al. 基于层叠条件随机场模型的中文机构名自动识别. 电子学报,
2006, 34(5): 804–809.
馆藏号:

 017/M2018(372)    

公开日期:

 2021-05-26    

基于笔画的中文字向量模型设计与研究.赵浩新

链接

题名:

 基于笔画的中文字向量模型设计与研究    

姓名:

 赵浩新    

学号:

 1501211040    

论文语种:

 chi    

专业:

 专业学 - 工程 - 软件工程    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 俞敬松    

导师1单位:

  软件与微电子学院    

论文答辩日期:

 2018-05-26    

外文题名:

 Design and research of Chinese Word Embedding Model Based on Strokes    

关键词:

 笔画 字向量 CBOW Skip-gram    

外文关键词:

 Stroke Word Embedding CBOW Skip-gram    

论文摘要:

数据表示是机器学习领域的基础问题。在机器学习任务中,第一步即输入样本数字化。不同于声音、图像、视频等数字信号,自然语言具有先天的高度结构化、抽象化的特点。因此自然语言任务的首要任务便是将语言文字数字化。

随着技术的发展,语言文字的表征方式不断进步。从最初始的one-hot到如今的分布式表示,词向量包含的信息愈加的丰富。现有的统计模型对于未登录词、低频词依然无法有效的表征。中文词向量研究受限于中文汉字特有的“象形”特征,尚没有一种有效利用笔画信息方法。

本文通过研究word2vec的CBOW框架,提出了一种基于笔画的汉字字向量模型,通过研究笔画组合构造汉字的规律,为中文未登录字、低频字等构造高质量的字向量。模型使用了以下方法:依靠当前汉字的上下文信息,将笔画向量化,学习笔画组合构造汉字的规律;引入注意力机制,丰富笔画构字的规律;采用CNN模型,捕捉汉字部件、合体字信息。与此同时,论文借鉴了生成对抗网络的思想,基于word2vec的Skip-gram模型,尝试以对抗的方式将笔画信息加入到字向量中。

测评工作是对比模型产生的字向量与word2vec、glove产生的字向量在中文分词、命名实体等任务上的准召率。其中在命名实体识别任务中,字向量F1值为81.6%,word2vec、glove分别为80.2%、81.2%。在分词任务中,分别为:96.23%,96.30%、96.31%。

分析表明,论文提出的模型可以有效的捕捉汉字笔画信息,并且有以下两点创新:使用CNN模型捕捉笔画构造汉字规律;引入Attention,计算笔画对汉字的贡献度。

外文摘要:

Data representation is a fundamental problem in machine learning. The first step in any machine learning task is to digitize the input samples. Unlike digital signals such as audio, images and video, natural language is inherently highly structured and abstract. The primary task in natural language processing is therefore to digitize the text.

As technology has developed, representations of natural language have steadily improved. From the original one-hot encoding to today's distributed representations, word embeddings carry increasingly rich information. However, existing statistical models still cannot effectively represent out-of-vocabulary words and low-frequency words. Research on Chinese word embeddings is also constrained by the "pictographic" nature of Chinese characters, and there is not yet an effective way to exploit stroke information.

Building on the CBOW framework of word2vec, we propose a stroke-based embedding model for Chinese characters. By studying how strokes combine to form characters, we aim to provide high-quality embeddings for unseen and low-frequency characters. The Stroke2Vec model uses the following methods: it vectorizes strokes using the context of the current character and learns the rules by which strokes combine into characters; it introduces an attention mechanism to enrich the stroke-composition patterns; and it uses a convolutional neural network to capture character components and compound-character information.
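
As a rough illustration of how an attention mechanism can weight the contribution of each stroke to a character vector, here is a minimal sketch in plain Python. It is not the thesis's implementation: the dot-product scoring, the function names (`softmax`, `compose_char_vector`), the stroke labels and the two-dimensional toy vectors are all assumptions made for the example.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def compose_char_vector(stroke_ids, stroke_emb, context_vec):
    """Build a character vector as an attention-weighted sum of stroke vectors.

    Each stroke is scored by its dot product with the context vector
    (an assumed scoring function); the softmax of the scores gives the
    contribution of each stroke to the character.
    """
    vectors = [stroke_emb[s] for s in stroke_ids]
    scores = [sum(a * b for a, b in zip(v, context_vec)) for v in vectors]
    weights = softmax(scores)
    dim = len(context_vec)
    char_vec = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return char_vec, weights


# Hypothetical toy stroke embeddings (real ones would be learned).
STROKE_EMB = {"heng": [1.0, 0.0], "shu": [0.0, 1.0], "pie": [1.0, 1.0]}
context = [1.0, 0.0]  # stand-in for the averaged context-character vector
vec, weights = compose_char_vector(["heng", "shu"], STROKE_EMB, context)
print(vec, weights)
```

In this toy case the stroke whose vector aligns with the context receives the larger weight, which is the behaviour an attention layer is meant to provide.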

We then evaluate the model by comparing its embeddings with those of Word2Vec and GloVe on named entity recognition (NER), Chinese word segmentation (CWS) and part-of-speech tagging. On NER, our model, Word2Vec and GloVe reach F1 scores of 81.6%, 80.2% and 81.2% respectively. On CWS, the corresponding F1 scores are 96.23%, 96.30% and 96.31%.

Meanwhile, inspired by generative adversarial networks (GAN), we extend the Skip-gram model of word2vec and attempt to inject stroke information into the character vectors in an adversarial manner during training.

分类号:

 TP3    

论文总页数:

 52    

参考文献总数:

 47    

参考文献列表:
[1] 石纯一, 黄昌宁, 王家廞. 人工智能原理[M]. 清华大学出版社, 1993.
[2] 常宝宝. 自然语言分析与生成术语简介[J]. 产品安全与召回, 2010(4):19-22.
[3] 张钹. 自然语言处理的计算模型[J]. 中文信息学报, 2007, 21(3):3-7.
[4] Goodstein R L, Harris Z. Mathematical Structures of Language[J]. Mathematical Gazette, 1970, 54(388):173.
[5] Bengio Y, Vincent P, Janvin C. A neural probabilistic language model[J]. Journal of Machine Learning Research, 2003, 3(6):1137-1155.
[6] Mnih A, Hinton G. A scalable hierarchical distributed language model[C]// International Conference on Neural Information Processing Systems. Curran Associates Inc. 2008:1081-1088.
[7] Mikolov T, Karafiát M, Burget L, et al. Recurrent neural network based language model[C]// INTERSPEECH 2010, Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September. DBLP, 2010:1045-1048.
[8] Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space[J]. Computer Science, 2013.
[9] Levy O, Goldberg Y. Neural word embedding as implicit matrix factorization[J]. Advances in Neural Information Processing Systems, 2014, 3:2177-2185.
[10] Goldberg Y, Levy O. word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method[J]. Eprint Arxiv, 2014.
[11] Ji S, Yun H, Yanardag P, et al. WordRank: Learning Word Embeddings via Robust Ranking[J]. Computer Science, 2015.
[12] Cao S, Lu W. Improving Word Embeddings with Convolutional Feature Learning and Subword Information[C]// AAAI Conference on Artificial Intelligence. 2017.
[13] Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space[J]. Computer Science, 2013.
[14] Bojanowski P, Grave E, Joulin A, et al. Enriching Word Vectors with Subword Information[J]. 2016.
[15] Mikolov T A. Statistical Language Models Based on Neural Networks[J]. 2012.
[16] Pinter Y, Guthrie R, Eisenstein J. Mimicking Word Embeddings using Subword RNNs[J]. 2017.
[17] Pennington J, Socher R, Manning C. Glove: Global Vectors for Word Representation[C]// Conference on Empirical Methods in Natural Language Processing. 2014:1532-1543.
[18] Chen X, Xu L, Liu Z, et al. Joint learning of character and word embeddings[C]// International Conference on Artificial Intelligence. AAAI Press, 2015:1236-1242.
[19] Lecun Y. LeNet-5, convolutional neural networks[J].
[20] Lecun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11):2278-2324.
[21] Bahdanau D, Cho K, Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate[J]. Computer Science, 2014.
[22] 许慎. 说文解字校订本[M]. 凤凰出版社, 2004.
[23] Luong M T, Pham H, Manning C D. Effective Approaches to Attention-based Neural Machine Translation[J]. Computer Science, 2015.
[24] Lin, Z., Feng, M., Santos, C. N. dos, Yu, M., Xiang, B., Zhou, B., & Bengio, Y. (2017). A Structured Self-Attentive Sentence Embedding. In ICLR 2017.
[25] Parikh, A. P., Täckström, O., Das, D., & Uszkoreit, J. (2016). A Decomposable Attention Model for Natural Language Inference. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.
[26] Cheng, J., Dong, L., & Lapata, M. (2016). Long Short-Term Memory-Networks for Machine Reading. arXiv Preprint arXiv:1601.06733.
[27] Paulus, R., Xiong, C., & Socher, R. (2017). A Deep Reinforced Model for Abstractive Summarization.
[28] Daniluk, M., Rocktäschel, T., Welbl, J., & Riedel, S. (2017). Frustratingly Short Attention Spans in Neural Language Modeling. In ICLR 2017.
[29] Liu, Y., & Lapata, M. (2017). Learning Structured Text Representations. In arXiv preprint arXiv:1705.09207.
[30] 梁南元. 书面汉语的自动分词与一个自动分词系统—CDWS[J]. 北京航空航天大学学报, 1984(4):101-108.
[31] 张华平, 刘群. 基于N-最短路径方法的中文词语粗分模型[J]. 中文信息学报, 2002, 16(5):1-7.
[32] Meishan Zhang, Yue Zhang, and Guohong Fu. Transition-based neural word segmentation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, 2016, pp. 421–431.
[33] Xue N, Shen L. Chinese Word Segmentation as LMR Tagging[J]. Proc of Sighan Workshop, 2003:176--179.
[34] Huang Z, Xu W, Yu K. Bidirectional LSTM-CRF models for sequence tagging[J]. arXiv preprint arXiv:1508.01991, 2015.
[35] 李航. 统计学习方法[M]. 清华大学出版社, 2012.
[36] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural computation, 1997, 9(8): 1735-1780.
[37] Cho K, Van Merriënboer B, Gulcehre C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[J]. arXiv preprint arXiv:1406.1078, 2014.
[38] Gers F A, Schmidhuber J. Recurrent nets that time and count[C]//Neural Networks, 2000. IJCNN 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on. IEEE, 2000, 3: 189-194.
[39] Yao K, Cohn T, Vylomova K, et al. Depth-gated recurrent neural networks[J]. arXiv preprint, 2015.
[40] 黄昌宁, 赵海. 中文分词十年回顾[J]. 中文信息学报, 2007, 21(3):8-19.
[41] Gehring J, Auli M, Grangier D, et al. Convolutional sequence to sequence learning[J]. arXiv preprint arXiv:1705.03122, 2017.
[42] Kalchbrenner N, Grefenstette E, Blunsom P. A convolutional neural network for modelling sentences[J]. arXiv preprint arXiv:1404.2188, 2014.
[43] Kim Y. Convolutional neural networks for sentence classification[J]. arXiv preprint arXiv:1408.5882, 2014.
[44] Hu B, Lu Z, Li H, et al. Convolutional neural network architectures for matching natural language sentences[C]//Advances in neural information processing systems. 2014: 2042-2050.
[45] Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[C]//Advances in neural information processing systems. 2014: 2672-2680.
[46] Goodfellow I. NIPS 2016 tutorial: Generative adversarial networks[J]. arXiv preprint arXiv:1701.00160, 2016.
[47] Cao S, Lu W, Zhou J, et al. cw2vec: Learning Chinese Word Embeddings with Stroke n-gram Information[J]. 2018.
馆藏号:

 017/M2018(401)    

公开日期:

 2018-05-26    

英语智能写作个性化辅助系统的设计与实现.赵恩辉

链接

题名:

 英语智能写作个性化辅助系统的设计与实现    

姓名:

 赵恩辉    

学号:

 1501210804    

论文语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 公开    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师1姓名:

 俞敬松    

导师1单位:

 软件与微电子学院    

论文答辩日期:

 2018-05-26    

外文题名:

 DESIGN AND IMPLEMENTATION OF ENGLISH INTELLIGENT PERSONALIZED ASSISTANT WRITING SYSTEM    

关键词:

 词网络 句子推荐 文章推荐 英语写作分级 英语写作辅助系统    

外文关键词:

 Vocabulary Network Sentence Recommendation Article Recommendation English Writing Level Computer-assisted English Writing    

论文摘要:

在生活工作的交流沟通和英语学习中,英语写作起的作用越来越重要。一方面只有丰富、准确的描述文章内容才能有效的传递思想和信息;另一方面对于母语为非英语的英语学习者来说,写作也可以提高英语水平,大量写作这也是“写长法”英语教学理论的基本要求。但是写作对于英语学习者来说却是一件很难的事情,针对写作困难的问题,出现了很多辅助写作系统。区别于这些系统,本系统是基于学生个人学习状况和写作水平,从词、句子、篇章多个维度进行帮助写作的个性化辅助系统。
词辅助模块首先建立多种词典,抽取词与词之间的多种关系构建词网络,然后基于句法分析,文章关键词技术和学生历史写作情况选取待扩展词,基于词网络关系推荐候选词,对待扩展词进行替换。蓝思(Lexile)在英语分级时,仅考虑了词频和句子长度,然后做了回归模型。本系统在判断学生英语写作水平和篇章难度时,不仅考虑了篇章的平均句子结构复杂度,平均句子长度,篇章词频取对数求平均值,句子数量,总单词数,最大单句结构复杂度,最长句子长度七个维度,并将这些文本特征与文本内容结合,通过自然语言处理和深度学习的算法模型及其他相关技术实现。句子辅助模块,首先判断学生的英语写作水平, 针对学生的写作水平推荐给学生适合其难度的句子,推荐句子时主要基于短文本相似性的相关技术实现,通过推荐参考例句扩展写作思路,提高句子表达的准确性和多样性。 文章辅助模块,首先判断学生英语写作水平和文章难度,推荐与学生英语写作水平一致的相似主题的写作范文,来扩展写作思路,推荐主题范文时,主要通过计算主题的相似性来实现。
本文也实现了词模块,句子模块以及篇章处理模块功能界面,并描述了各模块语料库的组成和构建过程。通过待扩展词采纳率对词模块进行了测评,采纳率为71.56%;通过人工标注数据对写作难度分级进行了准确率的测评,准确率为93.7%;通过人工标注数据对句子模块的准确度做了测评,取1个相似句命中准确率为84.83%,取6个相似句命中准确率为97%;通过推荐同主题范文的点击率对篇章模块做了测评,点击率为29.375%。本文也给出了各系统模块及系统架构的整体设计,并通过人工打分的方式对系统做了评估,总分5分,各方面整体评价在3到4分之间。这些指标体现了本系统各方面功能的精确程度,是系统功能和个性化准确程度的最主要影响因素。

 

外文摘要:

English writing plays an increasingly important role in everyday and workplace communication and in English learning. On the one hand, ideas and information can be conveyed effectively only when the content of an article is described richly and accurately. On the other hand, writing helps non-native speakers improve their English, and writing in large quantities is a basic requirement of the "Length Approach", an English teaching theory. However, writing is difficult for English learners, and many writing-assistance systems have appeared to address this difficulty. Unlike those systems, this system is based on each student's individual learning status and writing level. It is a personalized writing assistant that helps students at the word, sentence and article levels.
In the word module, we first build several dictionaries, extract multiple kinds of relations between words, and construct a vocabulary network from them. We then select the words to be expanded based on syntactic analysis, keyword extraction and the student's writing history, and recommend candidate replacement words according to the relations in the vocabulary network. Lexile grades English text with a regression model that considers only word frequency and sentence length. In contrast, when determining a student's English writing level and the difficulty of a text, we consider seven features: the average structural complexity of the sentences, the average sentence length, the mean log word frequency, the number of sentences, the total number of words, the maximum structural complexity of a single sentence, and the length of the longest sentence. We combine these features with the textual content and implement the grading with natural language processing and deep learning models. The sentence module first determines the student's English writing level and then recommends sentences of suitable difficulty; the recommendation relies on short-text similarity, and the reference sentences help students broaden their ideas and improve the accuracy and variety of their expression. The article module first determines the student's English writing level and the difficulty of each article, and then recommends model essays on similar topics that match the student's level; the recommendation is based on topic similarity.
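
To make the surface-level difficulty features concrete, the following sketch computes five of the seven listed features for a short English text. It is only an illustration under stated assumptions: the regex-based sentence and word splitting, the function name `text_features`, and the toy frequency table are hypothetical, and the two structural-complexity features are omitted because they would require a syntactic parser.

```python
import math
import re


def text_features(text, word_freq):
    """Compute simple difficulty features for an English text.

    word_freq maps a lowercase word to its corpus count; words missing
    from the table are given a count of 1, so their log frequency is 0.
    """
    # Naive sentence split on ., ! and ? (an assumption, not a real tokenizer).
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    tokenized = [tokens for tokens in tokenized if tokens]
    if not tokenized:
        raise ValueError("text contains no words")
    lengths = [len(tokens) for tokens in tokenized]
    words = [w for tokens in tokenized for w in tokens]
    log_freqs = [math.log(word_freq.get(w, 1)) for w in words]
    return {
        "num_sentences": len(tokenized),
        "total_words": len(words),
        "avg_sentence_length": sum(lengths) / len(lengths),
        "max_sentence_length": max(lengths),
        "mean_log_word_freq": sum(log_freqs) / len(log_freqs),
    }


# Hypothetical frequency table and sample text.
FREQ = {"i": 100, "like": 50, "cats": 10}
feats = text_features("I like cats. Cats like warm places very much!", FREQ)
print(feats)
```

A feature dictionary like this could then be fed to whatever grading model is used alongside the text itself.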
The system provides a user interface for each module. This thesis describes how the corpus of each module was composed and built, the overall design of each module, and the system architecture. The word module is evaluated by the adoption rate of the expanded words, which is 71.56%. Writing-level grading is evaluated on manually labeled data and reaches an accuracy of 93.7%. The sentence module is also evaluated on manually labeled data: the hit accuracy is 84.83% when the top 1 similar sentence is taken and 97% when the top 6 are taken. The article module is evaluated by the click-through rate of the recommended same-topic essays, which is 29.375%. We also evaluate the system through manual user scoring on a 5-point scale, and the overall ratings for the various aspects fall between 3 and 4 points.
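
The top-1 and top-6 figures for the sentence module can be read as a top-k hit rate. The sketch below shows one common way to compute such a metric; it assumes that a query counts as a hit when at least one of its first k recommendations was labeled relevant, which may differ from the exact protocol used in the thesis. The function name and the toy data are hypothetical.

```python
def top_k_hit_rate(ranked_lists, relevant_sets, k):
    """Fraction of queries with at least one relevant item in the top k."""
    if not ranked_lists:
        raise ValueError("no queries to evaluate")
    hits = 0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        if any(item in relevant for item in ranked[:k]):
            hits += 1
    return hits / len(ranked_lists)


# Hypothetical recommendations for two query sentences.
ranked = [["s1", "s2", "s3"], ["s4", "s5", "s6"]]
relevant = [{"s1"}, {"s6"}]
rate_at_1 = top_k_hit_rate(ranked, relevant, k=1)
rate_at_3 = top_k_hit_rate(ranked, relevant, k=3)
print(rate_at_1, rate_at_3)
```

Because a larger k can only add hits, the rate is non-decreasing in k, which matches the pattern of the reported top-1 and top-6 numbers.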

 

分类号:

 TP3    

论文总页数:

 73    

参考文献总数:

 45    

参考文献列表:
[1] 王初明.论外语“写长法”的教学理念[A].北京:中央编译出版社, 2002.
[2] 袁秀凤.近十年英语“写长法”教学模式研究综述[J] .宁德师范学院学报(哲学社会科学版) , 2013 (3) :108-111.
[3] 占飞. 计算语言学领域英文辅助写作系统[D]. 哈尔滨工业大学, 2011.
[4] Chen M H, Huang S T, Hsieh H T, et al. FLOW: A First-Language-Oriented Writing Assistant System[C]//Proceedings of the ACL 2012 System Demonstrations. Association for Computational Linguistics, 2012: 157-162.
[5] 孔行. 基于主题推荐的辅助写作系统[D]. 哈尔滨工业大学, 2015.
[6] 吴伟成,周俊生,曲维光. 基于统计学习模型的句法分析方法综述[J]. 中文信息学报, 2013 , 27(3): 9-19.
[7] Quattoni A, Wang S, Morency L, et al. Hidden conditional random fields[J]. IEEE Trans. PAMI 29(10),1848–1852 (2007).
[8] Page L, Brin S, Motwani R, et al. The PageRank Citation Ranking: Bringing Order to the Web[R]. Technical report, Stanford Digital Library Technologies Project,1998.
[9] Mihalcea R, Tarau P. TextRank: bringing order into texts[C]// Proc Conference on Empirical Methods in Natural Language Processing,2004:404-411.
[10] 刘知远. 基于文档主题结构的关键词抽取方法研究[R].清华大学, 2011.
[11] Bengio Y, Ducharme R, Vincent P, et al. A neural probabilistic language model[J]. Journal of Machine Learning Research,3:1137-1155,2003.
[12] Mikolov T, Sutskever I,Chen K, et al. Distributed Representations of Words and Phrases and their Compositionality[C]//International Conference on Neural Information Processing Systems,2013:3111-3119.
[13] Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space[J]. Computer Science,2013.
[14] Morin F, Bengio Y. Hierarchical Probabilistic Neural Network Language Model[J]. Aistats, 2005.
[15] Pennington J, Socher R, Manning C. GloVe: Global Vectors for Word Representation[C]// Empirical Methods in Natural Language Processing (EMNLP), 2014 :1532-1543.
[16] Le Q, Mikolov T. Distributed Representations of Sentences and Documents[C]// International Conference on International Conference on Machine Learning,2014: II-1188-II-1196.
[17] Tsoi A C, Tan S. Recurrent neural networks: A constructive algorithm, and its properties[J]. Neurocomputing.1997,15 (3–4) :309-326.
[18] Hochreiter S, Schmidhuber J. Long Short-Term Memory[J]. Neural Computation,1997,9 (8) :1735-1780.
[19] Dey R, Salem F M. Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks[C]//IEEE International Midwest Symposium on Circuits & Systems,2017 :1597-1600.
[20] Lipton Z C, Berkowitz J, Elkan C. A Critical Review of Recurrent Neural Networks for Sequence Learning[J]. Computer Science,2015.
[21] Mikolov T, Kombrink S, Burget L, et al. Extensions of recurrent neural network language model[C]//IEEE International Conference on Acoustics,2011, 125 (3) :5528-5531.
[22] Neculoiu P, Versteegh M, Rotaru M. Learning Text Similarity with Siamese Recurrent Networks[C]//Repl4nlp Workshop at Acl,2016.
[23] Mueller J, Thyagarajan A. Siamese Recurrent Architectures for Learning Sentence Similarity[C]//Thirtieth Aaai Conference on Artificial Intelligence,2016 :2786-2792.
[24] Sutskever I, Vinyals O, Le Q. Sequence to Sequence Learning with Neural Networks[C]//Neural Information Processing Systems,2014.
[25] Chung J, Gulcehre C, Cho K H, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling[J]. Eprint Arxiv,2014.
[26] Lowe R, Pow N, Serban I, et al. The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems[J]. Computer Science,2015.
[27] Rush AM, Chopra S,Weston J. A Neural Attention Model for Abstractive Sentence Summarization[J]. Computer Science,2015.
[28] Blei DM, Ng AY, Jordan MI. Latent dirichlet allocation[J]. J Machine Learning Research Archive,2003,3 :993-1022.
[29] 张龙凯,王厚峰.文本摘要中的句子抽取方法研究[J].中国计算语言学研究前沿进展,2011.
[30] Erkan G, Radev D R. LexRank: graph-based lexical centrality as salience in text summarization[J]. Journal of Artificial Intelligence Research,2004,22:457-479.
[31] Smith M, Turner J, Sanford-Moore E, et al. The Lexile Framework for Reading: An Introduction to What It Is and How to Use It[J]. Springer Singapore,2016.
[32] Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016:785-794.
[33] Bojanowski P, Grave E, Joulin A, et al. Enriching Word Vectors with Subword Information [J]. arXiv preprint arXiv:1607.04606, 2016.
[34] Joulin A, Grave E, Bojanowski P, et al. Bag of Tricks for Efficient Text Classification[J]. arXiv preprint arXiv:1607.01759,2016.
[35] Schuster M, Paliwal KK. Bidirectional recurrent neural networks[J]. IEEE Transactions on Signal Processing,1997,45(11):2673-2681.
[36] Kim Y. Convolutional Neural Networks for Sentence Classification[J]. Eprint Arxiv. 2014.
[37] Ketkar N. Convolutional Neural Networks[J]. Apress,2017.
[38] WiKi. WordNet. https://en.wikipedia.org/wiki/WordNet.
[39] Oxford University. British National Corpus[DB]. https://corpus.byu.edu/bnc/.
[40] Hilary N, Sheena G, Paul T, et al. British Academic Written English Corpus[DB]. https://www.coventry.ac.uk/research/research-directories/current-projects/2015/british-academic-written-english-corpus-bawe/.
[41] Manning C D, Surdeanu M, Bauer J, et al. The Stanford CoreNLP Natural Language Processing Toolkit[C]//Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, At Baltimore, Maryland,2014.
[42] Corpus of Contemporary American English (COCA) [DB]. https://corpus.byu.edu/coca/.
[43] Jurafsky D, Martin J H. Speech and Language Processing[G]. http://web.stanford.edu/~jurafsky/slp3/,2018.
[44] Christopher M. Bishop. Pattern Recognition and Machine Learning [M]. Springer,2007.
[45] Goodfellow I, Bengio Y, Courville A. Deep Learning [M]. The MIT Press,2016.
馆藏号:

 017/M2018(402)    

公开日期:

 2018-05-26    

基于深度学习的英文手写识别的设计与实现.王文杰

链接

题名:

 基于深度学习的英文手写识别的设计与实现    

作者:

 王文杰    

学号:

 1501210713    

语种:

 chi    

专业:

 专业学 - 工程 - 计算机技术    

公开时间:

 3年后    

培养层次:

 硕士    

学位:

 工程硕士专业学位    

培养单位:

 北京大学    

院系:

 软件与微电子学院    

导师姓名:

 俞敬松    

导师单位:

 软件与微电子学院    

答辩日期:

 2018-05-26    

题目(外文):

 Design and Implementation of English Handwritten Recognition Based on Deep Learning    

关键字(中文):

 手写识别 卷积神经网络 深度学习    

关键字(外文):

 Handwritten Recognition Convolutional Neural Networks Deep Learning    

文摘:

文字是人类进入文明社会的重要标志之一,推动着人类社会的进步和发展。在科技发达的今天,将这些纸上的古老符号转化成现代计算机中能够识别、存储和检索的内容有着重要意义。近些年来,随着深度学习技术的飞速发展,使用计算机对单个英文字符的识别已经达到了极高的准确率。但是,由于个人书写风格的差异、字符之间笔画的粘连等问题,对整个手写英文字符串进行识别仍是一个很有挑战性的问题。
本文主要针对手写识别领域两个主要任务——脱机手写识别和联机手写识别,进行了研究和模型设计。在联机手写识别中,对数据集中的点坐标和时间戳等信息提取出了5种特征通道,然后使用CRNN架构设计了深度学习模型。在脱机手写识别数据处理中,采用了随机位移、对比度变换和增加高斯噪声等几种不同数据增强方法;在脱机手写识别模型中,本文设计了基于“CNN+Seq2Seq”模型和基于全卷积的手写识别模型。最后,使用“垂直投影”算法对图片进行分行,将论文中的模型运用到了一个实际的工程项目之中。
在联机手写识别中,通过对测试集进行评估,基于CRNN模型达到了CER 5.7%。把不同的特征进行比较,发现“点的转角”这个特征对结果影响最大。在脱机手写识别中,本文基于“CNN+Seq2seq”模型能够达到CER 6.1%。基于全卷积的手写识别模型能达到CER 8.6%,虽然效果比基于“CNN+Seq2Seq”模型略差,但是在训练和预测时间上降低了40%。最后,通过对比不同的数据处理方法的结果,发现随机位移能够很好的防止过拟合,提高模型识别准确率。

文摘(外文):

Writing is one of the important markers of humanity's entry into civilized society, and it drives the progress and development of human society. With today's advanced technology, it is of great significance to convert these ancient symbols on paper into content that modern computers can recognize, store and retrieve. In recent years, with the rapid development of deep learning, computer recognition of single English characters has reached very high accuracy. However, because of differences in personal writing style and touching strokes between characters, recognizing a whole handwritten English string remains a challenging problem.
This thesis studies and designs models for the two main tasks in handwriting recognition: offline handwriting recognition and online handwriting recognition. For online recognition, we extract five feature channels from the point coordinates and timestamps in the dataset and then design a deep learning model based on the CRNN architecture. For offline data processing, we use several data augmentation methods: random displacement, contrast transformation and added Gaussian noise. For offline recognition, we design two models, one based on "CNN+Seq2Seq" and one that is fully convolutional. Finally, we use a "vertical projection" algorithm to split images into text lines and apply the models to a real engineering project.
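
A projection-based line split can be sketched as follows. This is an illustrative version only, assuming that "vertical projection" here means projecting ink onto the vertical axis (summing the ink pixels of each row) and cutting at rows with no ink. The binary list-of-rows image format, the function name `split_lines` and the `ink_threshold` parameter are assumptions for the example.

```python
def split_lines(image, ink_threshold=0):
    """Split a binarized page into text lines using a row-wise ink profile.

    image is a list of pixel rows, each a list of 0/1 values (1 = ink).
    Returns (start_row, end_row) pairs with end_row exclusive.
    """
    profile = [sum(row) for row in image]  # ink count per pixel row
    lines = []
    start = None
    for y, ink in enumerate(profile):
        if ink > ink_threshold and start is None:
            start = y  # first inked row of a new text line
        elif ink <= ink_threshold and start is not None:
            lines.append((start, y))  # blank row closes the current line
            start = None
    if start is not None:
        lines.append((start, len(profile)))  # line runs to the bottom edge
    return lines


# Tiny hypothetical page: two inked bands separated by a blank row.
page = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
]
bands = split_lines(page)
print(bands)
```

Raising `ink_threshold` makes the split tolerant of stray specks between lines, at the cost of possibly cutting through faint strokes.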
In online handwriting recognition, the CRNN-based model reaches a character error rate (CER) of 5.7% on the test set. Comparing the different features, we find that the "turning angle at a point" feature has the greatest influence on the results. In offline handwriting recognition, the "CNN+Seq2Seq" model reaches a CER of 6.1%. The fully convolutional model reaches a CER of 8.6%; although this is slightly worse than the "CNN+Seq2Seq" model, training and prediction time is reduced by 40%. Finally, by comparing the results of different data processing methods, we find that random displacement effectively prevents overfitting and improves recognition accuracy.
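
For readers unfamiliar with the metric, CER is commonly computed as the total character-level edit distance between the recognized strings and the reference strings, divided by the total number of reference characters. The sketch below follows that common definition; whether the thesis normalizes in exactly this way is an assumption, and the function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level character error rate over paired strings."""
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return errors / total_chars


# Two hypothetical recognition results with one character error each.
score = cer(["hello", "world"], ["hallo", "word"])
print(score)
```

Since insertions are counted as errors, this ratio can exceed 1 when the recognizer outputs far more characters than the reference contains.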

 

分类号:

 TP3    

论文总页数:

 69    

参考文献数:

 54    

参考文献:
[1] 刘排排. 空中手写字符串识别算法研究[硕士学位论文]. 北京交通大学, 2015.
[2] 武裕朴, 赵景台. 印刷体汉字识别方法综述[J]. 机器人, 1981, 3(5):6-12.
[3] Jain