PKU CAT

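Thesis Typesetting

2020-02-05T10:23:17+00:00

This post covers a few problems I ran into while typesetting my Peking University master's thesis, and how I solved them. It mainly draws on the master's thesis template and writing guide published on the official website of the School of Software and Microelectronics (软微).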

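Automatic multilevel lists and captions

The template provided by the school does not attach automatic list numbering to the headings, which makes automatic numbering and caption generation inconvenient.

In Word, define a new multilevel list and link each level to a style, so that levels 1 to 3 correspond to Heading 1 to Heading 3. Note that Heading 1 to Heading 3 all share the same indent. The thesis guide requires a gap of one Chinese character width between the number and the heading text. I did not configure that here and added it by hand instead; see item 6 of this post (the section on spaces after numbers).

Also modify the page header: use field codes to insert the chapter number and the chapter name, separated by one Chinese character width.

Because PKU master's theses require level-1 headings of the form “第一章”, the generated captions come out as 图一.1.

A workaround is to use Arabic numbering while writing (tick "Legal style numbering", 正规形式编号), so that captions come out as 图1.1. When it is time to export the thesis, switch the heading numbering back to Chinese. At that point the caption fields have not been updated yet and still show Arabic numbers. Exporting to PDF, however, updates all fields automatically, so the captions change back. To prevent this, select all fields and press CTRL+F11 to lock them. CTRL+SHIFT+F11 unlocks the fields, and F9 updates all of them.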
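
Styles

The styles in a thesis fall into several kinds: Heading 1 to Heading 3, body text, figures, tables, figure captions, table captions, references, and equations. It is best to define each of them as a separate style.

The table of contents can be styled through a custom TOC style.

One problem is the spacing before and after tables. The table style applies to the text inside the table, so after setting it as required, the gap between the table itself and the body text is too narrow. The only fix I found was to add line breaks by hand.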
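
Equations

The school's template is in doc format, and the latest version of Word cannot insert equations into it. A workaround is to create a separate docx document for writing equations and then copy them into the doc file, where they are stored as images.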
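
Code

The thesis needs code listings. I used http://www.planetb.ca/syntax-highlight-word to highlight the code and then pasted the result directly into Word. The writing guide does not specify the spacing between a code block and the body text, so I set it to 0.5 lines for now.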
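
References

There are two citation formats: the numeric system (顺序编码制) and the author-year system (著者-出版年制). In the numeric system, works cited in the text are marked with sequential numbers; the number goes inside "[ ]" and is set as a superscript at the point of citation. In the author-year system, the citation is marked with the author and the year of publication, usually placed inside full-width parentheses "（）" and separated by a comma. With the numeric system, the reference list follows the order of the citation numbers. With the author-year system, the reference list is grouped by language and sorted alphabetically by author, with Chinese-language works first.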
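The two ordering rules can be sketched in Python as follows. This is only an illustration: the function names, the tuple layout, and the sample entries are made up for the example, and the romanized author key is assumed to be supplied by hand (for instance pinyin for Chinese authors), since the guide does not say how Chinese names are alphabetized.

```python
def number_by_first_citation(cited_keys):
    """Numeric system: number each work by the order of its first citation."""
    numbers = {}
    for key in cited_keys:
        if key not in numbers:
            numbers[key] = len(numbers) + 1
    return numbers


def order_author_year(entries):
    """Author-year system: Chinese-language works first, then author key, then year.

    Each entry is (is_chinese, author_sort_key, year, text). author_sort_key is
    an assumed, hand-supplied romanization of the first author's name.
    """
    return sorted(entries, key=lambda e: (not e[0], e[1].lower(), e[2]))


# Hypothetical sample data, for illustration only.
citations = ["key_b", "key_a", "key_b", "key_c"]
entries = [
    (False, "Smith", 2010, "Smith J. ..."),
    (True, "zhang", 2015, "张三. ..."),
    (True, "li", 2012, "李四. ..."),
    (False, "Brown", 2018, "Brown A. ..."),
]

print(number_by_first_citation(citations))
for entry in order_author_year(entries):
    print(entry[3])
```

Repeated citations keep the number they received the first time, which is why the numeric list never needs re-sorting once the citation order is fixed.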

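The format of each reference entry follows the national standard GB/T 7714-2005.

I personally prefer the numeric system.

EndNote can be used to manage the references. A reference output style for the national standard is available online: https://cnzhx.net/blog/endnote-output-style-cnzhx/. Note that for Chinese entries with more than three authors, "et al" has to be changed to "等". Also, the numbers in the generated reference list are plain text rather than Word list numbering, which causes alignment problems. It is best to delete the text numbers and renumber the list with Word's numbering feature.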
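The "et al" to "等" substitution can be scripted on an exported plain-text reference list. The sketch below is a heuristic under stated assumptions: it treats an entry as Chinese when its author field begins with a CJK character (after an optional "[n]" prefix), and the function name is made up for this example. It is not part of EndNote.

```python
import re

LEADING_NUMBER = re.compile(r"^\[\d+\]\s*")   # optional "[12] " prefix
CJK_START = re.compile(r"^[\u4e00-\u9fff]")   # entry starts with a Chinese character
ET_AL = re.compile(r"\bet al\b")


def localize_et_al(entry: str) -> str:
    """Replace 'et al' with '等' in entries whose author list starts in Chinese."""
    body = LEADING_NUMBER.sub("", entry)
    if CJK_START.match(body) is None:
        return entry  # treated as a non-Chinese entry and left unchanged
    return ET_AL.sub("等", entry)


sample = "[12] 李悦, 吴敏, 吴桂兴, et al. 基于最大熵模型的介词纠错系统[J]. 计算机系统应用, 2016, 25(1):96-100."
print(localize_et_al(sample))
```

English entries pass through untouched, so the function can be applied to every line of the list.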
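
Spaces after numbers

The thesis guide requires a gap of one Chinese character width between the number and the text in the table of contents, in headings, and in captions (figure and table numbers). One Chinese character width corresponds to one full-width space, or two ASCII spaces. The guide also requires that there be no space between a caption's label and its number. Unfortunately, Word handles Chinese poorly here: the captions it inserts always have a space between the label and the number.

The only option is to select all text of this kind by style and then delete or replace the spaces.

Replacement example:

Before: 图 1.1 题注文本（English Text）

After: 图1.1　题注文本（English Text）

The space after “图” is easy to handle: just search for 图 followed by a space. The harder part is replacing the single ASCII space after the number with two spaces. If the caption contains English, a blanket replacement would also hit the spaces between English words. To avoid that, first replace the spaces between English words with a special placeholder, do the replacement, and then turn the placeholder back into spaces. Replacement pattern (with "Use wildcards" enabled):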
```
Find: ([a-zA-Z]) ([a-zA-Z])
Replace with: \1PLACEHOLDER\2
```
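For captions that have been exported to plain text, the same clean-up can be sketched in Python. This is an alternative to the Word wildcard replacement, not a description of it: because the pattern only touches the spaces around the number, no placeholder step is needed. The function name and the assumption that every caption starts with 图 or 表 followed by a dotted number are mine.

```python
import re

# Label, optional spaces, a number such as 1.1, one or more spaces, caption text.
CAPTION = re.compile(r"^(图|表)\s*(\d+(?:\.\d+)*)[ \u3000]+(.*)$")


def normalize_caption(line: str) -> str:
    """Drop the space after the label and put one full-width space after the number."""
    match = CAPTION.match(line)
    if match is None:
        return line  # not a caption line; leave it untouched
    label, number, text = match.groups()
    return f"{label}{number}\u3000{text}"


print(normalize_caption("图 1.1 题注文本（English Text）"))
```

Spaces inside the caption text, such as the one in "English Text", are preserved because the pattern captures the rest of the line as a single group.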
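
Footnotes

Pay attention to the indent of the footnote separator line. For how to change it, see: Word 2010如何修改脚注上方的横线? (how to modify the line above footnotes in Word 2010)

Footnote content should also follow a reasonably standard form, for example when citing a URL.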
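
Copyright and originality statement pages

For the copyright and originality statement pages, use the PDF downloaded from the PKU system. Export the Word document to PDF, then replace the pages with a PDF editor such as Adobe Acrobat.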

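Notes:

New chapters should start on an odd-numbered page
Use full-width parentheses in the text
In the references, use a half-width comma followed by a space, or use a full-width comma
Text in numbered lists does not need a hanging indent
Captions join the parts of the number with a dot, not a hyphen
When a figure consists of two or more sub-figures, label them (a), (b), (c), and so on, and give each sub-figure its own title
Figures and tables should not be laid out in a way that leaves large blank areas in the body text
More to come...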

Downloads:

Modified template: 模板.doc

Modified EndNote reference style for the national standard: geebinf modified by zz.ens

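CAT theses over the years

2019-07-06T23:07:17+00:00

I crawled PKU's thesis repository, searching by advisor name, so theses from other programs may be included.

This post is mainly intended for full-text search and is not very convenient to browse.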

2019-05-30
- 基于深度学习的自动句法纠错研究.黄浩洋
2019-05-29
- 基于自然语言处理的学生英文检错规则抽取研究.杨越
- 基于深度学习的视频行为识别研究.常志勇
- 辅助写作的语料库查询系统设计与实现.胡盖蕾
- 基于文献的中医经方靶点预测关键技术研究.张琢
- 基于网络表示学习的科技简报自动生成关键技术研究.张越
- 基于文本分析与计算的科技政策扩散关键技术研究.张丽颖
- 基于蒙特卡罗算法的皮肤病诊疗路径关键技术研究.张瑾
- 面向领域的先进技术侦测关键技术研究.张茜
- 基于层次条件变分自编码器的政府公文自动生成系统的设计与实现.邓雅妮
- 一种英语写作知识点推荐策略.Tianfang Gao
- 富信息古籍整理平台的设计与研究.刘晓娟
- 公文辅助阅读平台的设计与实现.何寒松
- 多功能古籍协同研究平台的研究与设计.邓娟
2019-05-27
- 大学英语写作学习平台游戏化设计研究与实践.戴欣怡
- 中文文本分析量化指标体系的研究与应用.杨雨萌
- 医学英语词典的研究与设计.尹梦佳
- 多维度智能英语词汇学习知识库研究.屠少辉
- 法律英语词汇学习系统研究与设计.包珍
- 基于思考帽理论的合作探究教学设计与实证.陈钗平
- 自适应英语写作系统社交模块的设计与实践.陈陟
- 面向考试应用的托福积极词汇学习微信小程序的设计.黄郭钰慧
- 出版审校流程中专业审校与目标读者审校的对比研究——以《培养小极客》为例.张心彧
- 京剧回译中的文化还原策略——以《伶界大王：1870-1937年京剧再造时期的演员与公众》为例.汪楚楠
- 翻译中的原型效应转移策略探究——以《推和敲》为例.杨舒涵
- 针对英语词汇石化问题的自适应词块系统研究与设计.王丽君
- 海外汉学著作精准回译策略研究——以《中国武术：从古代到21世纪》为例.钱康
- 基于语料库方法研究G.K.切斯特顿的反犹问题.窦蕾
2019-05-24
- 英文汉学著作的汉译：回译和变译.房一品
- 《译者的取与舍——简析英译汉的异化归化策略》.江皓如
2019-05-23
- 汉语“V-的”结构中的“的”及其锚定功能.叶永青
2019-05-20
- 供应链金融下中小企业信用评级研究 -以工程机械行业为例.孙浩
- 国际视角下建筑行业协会合作对建筑职业培训效果影响的研究.田志伟
2018-11-30
- 中国技术写作认证考试设计与实证.阮羽
- 医学英语词汇学习系统研究与设计.荣岩
- 基于多模态理论和图式理论的雅思听说学习系统的研究与设计.周璇
- 基于模拟方法的技术写作同源开发教学研究.杨爱萍
2018-06-06
- 指称理论对于生成语法的必要性.张振宝
2018-05-27
- 英汉翻译中的变通与忠实.张英杰
2018-05-26
- 基于深度学习的文本语句扩展系统的设计与实现.于昌和
- 基于多人在线战术竞技游戏的虚拟团队数据分析与研究.曾伊蕾
- 基于神经网络的影视剧向量表示模型.隋春宁
- 面向移动端的用户检索实体抽取系统设计与实现.曹圣明
- 基于笔画的中文字向量模型设计与研究.赵浩新
- 英语智能写作个性化辅助系统的设计与实现.赵恩辉
- 基于深度学习的英文手写识别的设计与实现.王文杰
- 基于机器学习的作文分析系统设计与实现.李海涛
- 基于深度学习的英语语法纠错系统的设计与实现.陈宏业
- 基于深度学习的英语口语发音评测系统的设计与实现.吴琼
- 面向英语智能学习的知识库系统的设计与实现.梁彪
- 基于深度学习的实体关系抽取的研究.唐弘毅
- 数据驱动的海洋意识评价指标体系的构建与实证研究.王一博
- 基于深度神经网络的弱监督人脸识别方法研究.于程程
- 基于paraphrase generation的英语作文辅导功能的后端设计和实现.万泽宇
- 面向教育类视频的摘要生成技术研究与实现.帅远华
- 面向专业领域的自动综述关键技术研究.涂梦
2018-05-25
- 面向显隐式语法教学的学习材料加工和教学优化研究.林凤怡
- 基于支架式理论的技术文档写作教学研究.闫晓宁
- 中式英语的自动检测研究与应用.于婵
- Keystroke logging 评估的技术写作和术语教学研究.钟梦俐
- 服务于中小学教师的在线研修系统的设计与实现.李贺
- 翻转课堂教学的游戏化设计和实证研究.吴丹
- 基于语块和数据分析的高中英语写作一体化的教学研究.迟蕊沂
- 面向读写一体化的英语写作系统的研究与设计.刘玥杉
- 官话方言翻译黑人英语的策略研究——以《绝非虚构：我的人生教训》为例.徐靖凯
- 葡萄酒文化的通俗化翻译——以《新索斯比葡萄酒大百科》为例.方一凡
- 汉学社科类著作中本源概念的翻译研究——以《珠三角的女儿》为例.李文婷
- 基于增强型电子书的音乐剧著作的深度翻译策略研究——以《美国音乐剧的秘密生活》为例.乌天骄
- 科幻虚构词的偏离手段及翻译策略——以《神秘博士：耀眼的黑暗》为例.宋雅雯
- 复杂历史文本翻译中背景知识图的设计与应用——以《成吉思汗的宗教思想：世界征服者给予我们的宗教自由》翻译为例.刘珈池
- 面向海外粉丝型受众的国产剧字幕反常化翻译策略研究.李梅娟
- 技术写作在分布式敏捷开发中的沟通管理研究–以K公司为例.李慧敏
- 西方艺术史书籍中文化因素的翻译策略—以《20世纪的艺术》为例.马璇
2018-05-24
- 时态成分对句子的语义贡献.郑莉莉
2018-05-19
- 《公安基层派出所激励机制设计与应用》.吴书光
2017-11-29
- 针对写前准备阶段的英语写作训练系统前端设计和实现.王勤晓
- 情感化设计在技术文档中的应用研究.余瑶
- 引申译法在英汉翻译中的应用——以《我与克鲁克的边境军旅生活》为例.陈纯
2017-11-20
- 基于任务型和游戏化的高职词汇教学研究.李尚
2017-05-21
- H市派出所民警工作特征对其职业倦怠的影响研究.王翔宇
- 一个政府创新券申请书分析系统的设计与实现.周世洋
- 基于自然语言处理技术识别假新闻的研究.黄颖彪
- 关于中文输入法准确率方面的研究.郑静
- 最佳教学实践指导下的英语听力学习系统的前端设计与实现.杨超
- 基于PGIS的某市警务信息研判系统的设计与实现.刘聪
- 搜索引擎查询短语中的命名实体识别方法研究.马胜节
- 对话式交互中问答系统的设计与实现.岳聪
- 基于HBase的HAWQ查询优化研究与实现.谢钧涛
- 分布式实时流处理系统的性能和可靠性的研究和优化.吕云松
- 搜索广告中非对称先验的有监督LDA模型的设计与实现.章玲通
- 基于深度增强学习的多轮对话系统设计与实现.徐粲
- 基于句法分析的英语型式自动识别.刘潇杨
- 术语自动抽取系统的设计与实现.石朋欣
- 面向自适应教学的英语口语资源加工方法的设计与实现.阙颖
- 最佳教学实践指引下的英语词汇学习系统前端设计与实现.徐冉
- A省公安边防信息安全管理风险评估模型设计与应用.邵健健
- 垂直领域专家观点可信度关键技术研究与实现.王晴旭
- 面向科学文献的比较式摘要生成技术研究与实现.杨雨青
2017-05-19
- 语音技术传播可用性研究——以热线客服为例.高岚
- 在线合作批注翻译教学研究.肖龙
- 从认知负荷的角度探究弹幕对在线学习者的影响.路康虹
- 以问题为导向的翻转课堂自学效果的应用研究.龙翔
- 基于规则方法的对外汉语“语序错误”检测研究.张璐瑶
- 基于数据驱动和形成性评估的高中英语词汇教学研究.程鑫
- 基于wiki的小组协作式翻译教学研究.代碧薇
- 中国网络玄幻小说海外译介研究.邓平博
- 美国嘻哈乐歌词翻译研究.周天亮
- 政治词汇的汉译策略——以《分裂社会的城与魂》一书的翻译为例.张咪
- 基于行为特征和数据分析的外语词汇学习模型研究.赵海威
- 常见易错搭配辨析及预测.耿思思
- 面向技术文档写作修改过程的书面沟通优化方法的设计与实现.林梦姣
- 西方服饰术语的翻译策略——以《现代时尚历史：1850-2010》为例.李亚楠
- 异语写作的无本回译研究——基于《中国营养疗法：中医营养学》一书的翻译.陈培琳
- 中美高校国际形象片叙事对比研究.段夕超
- 基于语义场的金融词汇翻译策略——以《股权众筹投资指南》为例.鲁晨
- 翻译修改过程中技术审校和语言审校的对比研究.李雅琦
- 基于国内创业者需求的海外创业公司新闻编译策略研究.梁欣
2016-11-30
- 出入境货运船舶边检管理风险评估系统的设计和实现.陈嘉慧
- 体裁分析视角下南海事件新闻报道研究.韩易菲
2016-05-31
- 基于语料库的科技文翻译腔研究.张琳
- Dota类对战游戏台词翻译策略研究.宋军
- 科学修辞劝说视角下的科普汉译策略——以 Debunk It 为例.孙庆娟
- 移动网页式说明书中不同程度渐进式呈现及其效果研究.朱灿华
- 批评话语分析视域下的中美英智库中“一带一路”评论对比研究.肖杰
- 基于阅读动机的面向初中生读者的科普文本编译研究.马荣荣
- 基于形成性评估的高中英语词汇评估方法设计及有效性研究.梁云辉
- 教育术语的翻译策略探究—以《“伟大”美国学校体系的死与生》一书的翻译为例.党凡钰
- 美国大选电视辩论中的字幕翻译策略研究.邵巾芮
- 复杂并列结构的汉译策略研究——以《优秀的绵羊》为例.陈子怡
2016-05-30
- 本地化项目管理中翻译流程优化的研究.杨恒杰
- 基于技术接受模型的计算机辅助翻译软件用户接受行为和培训研究.荆斌
- 以过程监控为核心的翻译查证能力教学研究——以CATTP平台为例.李雅慧
- 认知负荷理论指导下教材翻译的翻译策略.杨冰莹
- 语块在翻译教学中的应用——基于眼动追踪的实验研究.李静雅
- 英汉翻译中习语的处理策略研究.张亚琦
- 以游戏机制为核心的教育游戏在英语语法教学中的设计与实现.李想
- 粉丝文化对翻译的影响—— 以“欧美圈”为例.尹玉珺
- 交互式多媒体技术教程可用性研究.赵寻
- 英国文化词汇的汉译策略——以《不畏艰险》为例.石晨
- TED演讲中说服力的语言层面分析.李琳
- 《戴尔模式》一书中情感词汇的翻译策略.高月
- 基于自适应学习模式的高中英语听力教学研究.宋凌云
- 基于参与主体视角的IT图书翻译出版活动研究——以图灵教育《洞悉数据》的翻译出版为例.刘云涛
- 模糊匹配句段与译者认知努力相关性的研究.张能
2016-05-24
- 面向受限领域的语义分析系统的设计与实现.郝瑞祥
- 基于层级模型的文本分析及其应用.李茹蒙
- 基于“写长法”的英语写作计算辅助技术研究.刘艳珣
- 基于循环神经网络的端对端文本蕴含识别.王旭光
- 基于机器学习的智能英语教材编纂系统.王志伟
- Query切分及其在相关性排序上的应用.卢刘杰
- 面向多轮对话的语句建模研究.薛卉
- 基于语料库的国际汉语学生中文辅助写作系统.胡亮
2016-05-23
- 成分分析和函数主目分析：分歧与融合.刘骁萱
2016-05-22
- L市公安局民警绩效考评提升的研究与应用.席海龙
- 警用标准地址数据管理和服务系统的分析与设计.王一骄
2015-06-10
- 俄语汉译与俄语英译中文化补偿策略对比研究.张欢
2015-06-01
- 技术文档英汉翻译中模糊语的处理方法研究—以《ViewletBuilder 版本7专业版用户手册》为例.李雨
2015-05-31
- 市场营销中品牌名称的汉译策略——以《富人消费者：奢侈生活方式的营销与销售》为例.李小溪
- 文本类型视角下社会学专著的翻译研究——以Pricing Beauty: The Making of a Fashion Model为例.范冬妮
- 基于语篇特征的句子仿拟英汉翻译研究.何令琪
- 英汉翻译时名词化的翻译策略 -以社科文本《欧洲经济史》为例.蒋纬
- 针对汉学传统服饰类文本的术语回译研究.李文丽
- 操纵视角下中国政治文献法译研究——以《习近平谈治国理政》为例.杨慕
- 从图文关系看纪录片解说词的翻译——以《自然世界》翻译为例.姚传云
- 基于日语单语语料库的中日同形词翻译实务应用研究.李瑞鹏
- 饮食文化研究中特殊词汇的翻译策略—基于《食物发展史》翻译实践.夏锁
- 英语商业广告汉译策略分析 ——以《1852-1958年，百则最优广告》为例.由晓菲
- 海外华侨华人研究中术语翻译探究——以《新美国华侨华人社会：阶级、经济和社会等级》为例.谢润超
- 新媒体环境中新闻编译策略探究——以界面新闻《眨眼间》编译项目为例.王付娇
- 汽车制造现场口译的难点分析及应对策略研究——以北京奔驰汽车有限公司喷漆车间为例.蒋博
- 移情视角下的非文学文本翻译研究——以主题为”Doing Business in China”的多著作翻译实践为例.姜楠
- 种族歧视词汇空缺的翻译策略研究——以《城中之城：密歇根州大急流城的黑人自由斗争》为例.雷阿芳
- 第一人称代词“we”在经济学英语文本中的汉译策略研究—以《经济动态的计算方法》为例.徐翔
- 管理类书籍中祈使句的翻译研究——以《翻译服务管理》为例.连昭
- 插入语的翻译策略研究——以《文化与帝国：数字革命》为例.翁敏
- 基于语篇的英汉译文重构分析—以John F.Kennedy的汉译为例.胡蓉
2015-05-29
- 基于自适应学习模式的英语从句语法教学研究.林毅君
- 基于自适应模式的英语阅读教学研究.吕京
- 基于翻译认知心理的新型机器翻译系统交互界面的研究.林毅超
- 基于语料库的中美时政新闻英语语体特征对比研究.范琳琳
- 可及性视角下英语指示照应的翻译策略——以Institutionalization of UX的汉译为例.符吉聪
- 深度学习在依存分析中的应用.黄苹苹
- 跨文化传播下中国用户对技术文档需求的实证研究——以归纳/演绎结构以及图文关系为例.李倩
- 软件行业英语应用移动学习资源库构建研究——以Project X为例.袁凯
- 机器翻译译后编辑对英汉翻译效率提升研究.张路露
- 中国饮食全球化进程中的菜名英译研究.李秀颖
- 科技博客的语言特点和编译策略研究.孙瑜
- 基于自适应学习模式的大学英语产出性词汇教学研究.徐亮
- 语言服务团队术语管理能力评估.许欣蕾
- 跨文化视角下央企英文社会责任报告文本分析.程千
- 中德现场工程师的跨文化冲突研究——以北京奔驰发动机项目为例.刘天意
- 婚姻研究中文化负载词的汉译策略—以《为婚姻正名》为例.郭皓洁
- 平行文本在社科类著作翻译中的应用——以The Children of Chinatown的翻译为例.祁红坤
- 财经类通俗读物中人称代词的翻译策略——以Brilliant Accounting一书的汉译为例.魏宁
- 互联网技术科普书籍中插图文本的翻译策略.刘雨萌
- 基于人际意义的员工手册翻译策略研究——以 FCA Employee Handbook 2014 为例.何丹
- 威尔斯翻译理论在本地化项目管理学术著作翻译中的应用.张海兰
- 顺应论视角下英语插入语的翻译研究——以 Living and Dying With Cancer为例.马占领
2015-05-28
- 基于条件随机场的用户查询日志中的影视类命名实体识别.李高扬
- 基于认知心理的计算机辅助翻译工具界面探索.郑江锋
- 分布式系统的升级和数据迁移问题研究.黄礼骏
- 基于复述及语义分析的智能问答系统.黎槟华
- 智能电子词典系统的研究和实现.厉海洋
- 基于协同训练的专利文本分类.位明旭
- 结合思维导图的MOOC学习路径设计与应用研究 ——以”计算机辅助翻译原理与实践”课程为例.何美伊
2015-05-19
- 回译在英文技术写作教学中的应用研究.徐彬彬
2014-12-10
- 寻找理论到实践的切入点——谈翻译标准对翻译实践的指导意义.陈巧云
2014-11-28
- 英文科普著作中的隐性连贯及其汉译.周梦洁
- “self-”复合词的翻译研究.刘飞
2014-05-31
- 利用语料库和网络资源解决英译汉的“难译词”问题研究——以马汉著作的汉译为例.徐征
- 英汉IT科普翻译之词汇层面翻译研究——以《傻瓜丛书：无线家庭联网》汉译为例.陈甜甜
- 省力原则指导下的显化翻译研究——以科研文献的翻译为例.马千里
- 英语技术文档中动词文体特征的研究.安妮
- 文献角度下看汉学的写作和翻译——以The Last Empress: The She-Dragon of China译本为例.李卓勋
- 论主位推进与语篇的衔接和连贯及翻译策略——以《职业健康科学：压力、精神生物学与工作新天地》为例.仲婕
- 电影研究中术语的翻译策略—以《银幕上的中国：电影与民族》为例.赵雪艳
- 幽默语气的翻译策略——以《匆匆》为例.刘琼
- 跨文化传播视角下基于读者因素的译文详略处理研究——Chinese Business Etiquette and Culture 翻译报告.李响
- 翻译中的欧化现象及其可接受度的实证研究——Breaking Free的翻译实践报告.王汉江
- 基于语料库的大学生译作的欧化翻译研究.刘倩
- 中式英语在词项搭配层面的表现探析.张涵
- 开放课程特点及对应翻译策略研究——以斯坦福大学《iOS应用开发》为例.崔梦婕
2014-05-30
- 生态翻译视角下交叉学科科普文本中隐喻的汉译策略研究——以《真正的环境危机》为例.鹿桐欣
- 基于语料库统计的英汉连词省译研究——以《苏联语言政策》为例.李金蔓
- 科普翻译准确性与可读性的平衡研究——以《宇宙的100个关键发现》为例.郭萃
- 服务于翻译教学的学习者语料库定量研究.韩林涛
- 《中国服饰变迁》中服饰文化的翻译研究.季梵
- 语境及篇章对经济类文献翻译的指导.关赢
- MOOC与翻转课堂模式结合的课程设计与应用研究——以翻译技术课程为例.陈泽松
- 基于CATTP平台的wiki协作式写作和同伴互评研究——以技术文档写作教学为例.吴燕秋
- 《倔强的土地》一书中专有名词回译策略研究.于亚楠
- IT类视频教程的配音和字幕翻译模式的学习效果研究.欧丽
- 国内语言服务业翻译技术认证考试的内容设计与实证.王聪
- 《远东的灵魂》一书隐喻翻译策略研究——以认知隐喻为视角.刘兴颖
- 中美企业介绍文本的写作研究.顾学军
- 可用性视角下技术文档的翻译研究.焦喜音
- 汉英翻译中植物词汇的处理策略研究.国梦影
- 商务函电模糊词分析及汉译策略研究——以The McGraw-Hill Handbook of More Business Letters为例.杨敏
- 跨文化语境中旅游文本的直译加注策略分析——以《巴基斯坦和喀喇昆仑公路》为例.陈丹丹
- 软件本地化中的UI翻译策略.赵颖豪
- 语境在传记文本翻译中的重要作用——以《纳尔逊传：霍雷肖·纳尔逊的一生和他的传奇故事》为例.卢凤骄
- Never in My Wildest Dreams中隐喻的翻译——基于博弈论视角.朱文佳
- 摄影术语分析及术语库建设.刘溢杰
- 英语隐性逻辑关系的汉译研究——以《信息未来，谁主沉浮？》为例.张诗玲
- 基于语料库的中美高校简介对比研究.刘家良
- 基于威尔逊翻译模式的科普英文中带连字符的复合形容词翻译研究–以《科普天文学》为例.赵晓玮
- 基于归化和异化的文化背景翻译策略研究——以The Big Screen: The Story of the Movies and What They Did to Us为例.夏智琳
- 英汉翻译中的“不译”策略研究——以《言论的权力：生活中的语言政治》为例.何昕
- 第二人称代词在科技英语中的汉译策略研究–以《技术写作101》为例.杨莹
- 移动应用的本地化翻译研究—以WeChat Android 5.0.3为例.严敏
- 语义韵视角下英语新闻报道中模糊限制语的翻译研究.王蕾
- 纪实文本中情感表达的翻译策略研究——以A Problem from Hell: America and the Age of Genocide一书翻译为例.汪炜
- 英汉翻译时主语重新择定技巧——以《城市绿色增长》为例.孙慧杰
- 医学英汉翻译实践中模糊语的处理——以《营养学——知识与运用》为例.王崇毅
2014-05-29
- 结构化稀疏方法在语法自动纠错中的应用.李欢
- 网络广告关键词的流量预估.张健
- 基于言语行为的特征选择与分类.崔小薇
- 面向中文问答及对话系统的复述技术研究.张博
- 英语作文介冠词自动改错研究.肖凤霞
- 英文作文自动评分算法研究及系统实现.刘建阳
- 面向团购服务的推荐算法的研究与实践.洪春晓
- 金融业网页采集系统的研究与实现.周舵
- 面向科技文献的语义标注平台的研究与开发.张子渊
2014-01-04
- 功能主义翻译理论和读者反应理论视角下的英文简历翻译实践.陈磊
2013-11-30
- 《奥巴马族》一书中定语从句的翻译技巧.张敏
- 英汉翻译中的语序调整——以《如何为傻瓜工作》汉译为例.韩华
- 基于语料库与文本分析的《老子》英译本比较研究.牛佳玥
- 科普翻译中分译与合译的全信息视角——以《人类本性的科学》翻译实践为例.沈威杰
2013-06-08
- 翻译视角下的双语词典研究与设计.方舟
- 技术说明书的易读性研究.杨涵舒
- 虚拟翻译团队绩效问题研究.曹达钦
- 昆曲翻译与英文诗歌的互文性——以李林德《牡丹亭》译本为例.卢伟
- 殖民游记作品中殖民话语的翻译策略——以Tales of Travel All Around the World一书的翻译为例.邵晶晶
- 基于语料库的技术文档模糊限制语使用研究.何京燕
- 中外政府网站招商引资文本对比研究.范平
- 《加菲猫》漫画的翻译研究.李蕾
- 目的论视角下《盗墓笔记》英译研究.汪林
- 帮助学习者英语表达的英汉电子学习词典的设计与实现.赵梦初
- 不同媒体下的语言特点及对应翻译策略的研究——以《魔戒》为例.张昀
- 基于客户投诉的酒店职业英语培训设计.王芳
- “熟词生义”现象及面向中国英语学习者的在线词典改进设计.瞿乔
- 基于阅读策略的英语专业阅读课程教学设计.尤春丽
- 翻译团队激励机制对提升翻译项目质量的研究.潘婧
- 游戏翻译研究——以掌机游戏为例.张寅
- 中美英军事新闻英语语体特征对比研究.许文锋
- 人物专访的译注问题——以 100 New Yorkers of the 1970s 为例.曹广慧
- 商业翻译团队成员人格特质对团队绩效的影响研究.潘媛
- 从景观翻译看旅游文体的翻译美感体现.玉薇敏
- 小说中宗教文化的翻译策略研究——以《This Time Forever》中译为例.官毅
- 译者主体性下的通俗读物语言翻译.关怡然
- 在线同伴反馈翻译教学研究.赵玉涛
- 软件用户手册的英译汉研究—以《Wordsmith Tools Manual-Version 6.0》翻译工作为例.孙俊方
- “零度”视角下对涉及道德问题的文本的翻译.苏畅
- 商务英语文体风格三因素分析法及翻译应用.张萌
- 功能对等理论视角下的英语长句汉译策略研究.陆遥
- 英汉翻译中语篇逻辑连接的实例研究.王倩
- 体育传记文本中方式副词的变译方法探究——以 Red Men: Liverpool Football Club – The Biography 中译为例.周游
- 从文化预设的角度探讨励志类文本的英汉翻译策略——以《Make More, Worry Less》英汉翻译为例.王方圆
2013-06-07
- 面向信息处理的现代汉语数词及常用涉数结构研究.颜秦进
- 面向信息处理的现代汉语时间表达研究.王雅慢
- 搜索广告中不相关广告识别算法的研究.吴春煦
- 俄汉辅助翻译平台设计与实现.勾一博
- 基于图论和进化博弈论的聚类算法研究与应用.毕超
- 基于机器学习的Twitter名人分类研究.辛洁
- 在线收藏夹标签推荐服务的设计与实现.陈慧挺
- 基于主动学习策略的中文名址切分标注研究.喻洁琼
- 基于Neo4j的地名本体构建研究.杨洁
- 基于抽象语法树的程序形式化转换研究.马玉超
- 面向交互式机器翻译的汉英平行文本Chunk对齐.吴胜兰
- 疾病自动编码系统的研究与开发.黄家驹
- 基于WEB的语义元数据辅助构建平台关键技术研究与实现.郑德举
- 领域本体在线辅助构建系统的研究与开发.郭志军
- 面向技术创新的铝业本体自动构建研究.王明程
- 面向专利文献的中文句法分析与错误检测研究.李艳萍
- 基于WEB的多领域语料标注加工系统的设计与开发.肖铮
- 面向中学英语教学的知识库自动构建研究.黄毅
2013-06-06
- 科普书籍英译汉中注释的研究.杨德林
2013-05-31
- 交互式写作（翻译）辅助系统.刘强
2012-12-10
- 武侠小说中粗话的英译——以《鹿鼎记》为例.康宇
2012-12-08
- 交传口译的任务型自主学习研究.吴桂兰
- 修辞学视角下的英文产品宣传册劝说功能研究.缪行
- E-learning在翻译公司的应用和探究.李艺峰
- 基于统计的《红楼梦》两个俄译本诗歌翻译风格分析和比较.管佳珏
- 用户文档的可用性研究—以智能手机用户手册为例.凌昱
- E-learning产品本地化翻译及可读性研究.王培方
- 演示文稿翻译策略研究.王薇
2012-06-04
- 一种改进的动态网络最短路径算法.季海坤
2012-06-03
- 基于语料库的互联网科普作品欧化翻译研究.唐舒芳
2012-06-02
- 基于模因论视角的流行语翻译.赵阳
- 英文网络游戏语体特征的多维度分析.苏霄
- 餐饮推荐系统的设计与实现.孔令恺
- 在翻译工具辅助下的翻译过程实证性研究.刘小雨
- 语料库驱动的中英旅游英文官网语体差异分析.雷雯霆
- 基于LDA模型的博客主题提取.王珍
- 基于英语单语语料库的英汉翻译实务应用研究.李南哲
- 中医对外援助工作的翻译研究及术语库建设.范慧阳
- 基于主题模型的全宋词语料库构建以及计算机辅助宋词创作研究.黄子轩
- 一种对外汉语教学辅助软件的开发.韩捷
- 基于语料库的二本院校非英语专业学生写作现状研究.赵巍
2011-12-10
- 面向学术论文计算机辅助翻译的受限汉语研究.王雷
2011-11-28
- 本地化用户文档系统功能分析及其翻译.张凤
- 名化在科技语篇汉英翻译中的应用.杨倩
- 面向国际汉语教学的成语应用偏误研究及成语学习知识库的设计与建设.张一宁
- 英汉语篇衔接及其翻译.刘劲松
2011-11-28
- 日语三字词的探究——以报纸语料为例.赵玲莉
- 手机行业基于用户行为的意图发现.刘先兵
- 构建面向用户的软件本地化翻译质量评估体系.李晓英
- 视频台标检测技术的研究与实现.李铁
2011-11-24
- 激励制度、工作特性与本地化工程师离职倾向的关系研究——以H本地化公司为例.孙鹏亮
2011-06-05
- 古汉语文本自动句读研究.许京奕
2011-06-04
- 社交网络用户网页基于语义排序的关键词抽取技术.李春旭
2011-06-03
- 基于语言模型的汉英多词表达互译对自动提取研究.朱莎莎
- 中文网页褒贬评价系统的设计与实现.董洁
- 翻译公司品牌建设模式研究.赵凰吕
- 翻译企业管理优化及商业智能研究.李坤
- 虚拟物品个性化推荐系统设计与实现.李波
- 基于AHP的供应商评价模型及其应用研究.张望
2011-06-02
- 读者主体视角下国内技术文档写作与翻译研究.朱鹏飞
- 项目协作式翻译教学网络辅助平台的研究与设计.徐庆
- 语言和流程角度的用户手册创作研究.孙晓东
- 基于交际翻译理论的技术文档本地化.苏昊明
- 运用翻译技术实施医学翻译教学改革.孟阳子
- 关联理论下用户手册翻译研究——以打印机用户手册翻译为例.徐尧
- 软件本地化项目风险管理解决方案.杨丽霞
- 计算机辅助翻译工具的测评框架与测评.高志军
- 基于动词模式匹配的英语写作自动批改的研究与实现.靳光洒
- 软件本地化中标点符号的翻译策略.黄河
2010-12-01
- 计算机辅助的本地化翻译质量检查系统研究与实现.丁矗
- 基于语义图和半指导学习方法的关键词获取技术研究.李德聪
- 搜索引擎查询建议系统的设计与实现.母亦翔
- 敏捷开发环境下英文用户文档开发研究.金坤
- 基于语料库的莎士比亚戏剧中译本译者风格研究.李小怡
2010-09-01
- 生物序列比对的概率图算法在并行通用处理器上的设计与实现.尹朝明
2010-06-12
- 以均根匀度为中心的语言信息计量研究.张化瑞
2010-06-04
- 汉语文本中的隐喻计算研究.贾玉祥
2010-06-03
- 新闻自动提取系统的设计与实现.彭必扬
2010-06-02
- 计算机辅助翻译专业的课程研究与改革.韩依彤
- 主位述位理论在软件翻译中的应用.高辉
- 本科英语专业翻译教学模式的改进研究.温彬
- 使用多媒体语料库改善英语交际能力教学的研究.刘晟
- 面向多语言服务平台的术语管理研究.李昕玥
2010-06-01
- 国内翻译公司的推广策略研究——以元培翻译公司为例.陈庆
- 计算机辅助翻译技术在公关行业双语资料管理中的应用.尹静
- 用信息化管理提升翻译企业核心竞争力的研究–以北京元培翻译为例.王晓婷
- 翻译技术课程的教学实践研究.王华树
2010-05-28
- 一个即时通信翻译系统的设计与实现.高天奇
- 文本过滤及其在问题过滤中的应用.李吉
- 基于隐马尔可夫模型的股指预测研究.田健南
- 面向计算机辅助翻译的自动化译前准备研究.张永伟
- 基于用户分类与行为分析的旅游类智慧型搜索用户模型的研究.孟凡亮
- 英语动词性隐喻识别研究.徐建
- 基于用户行为的搜索引擎查询质量评估.孙宇
- T级大规模检索系统和基于Hadoop的分布式检索系统研究.杨威
2010-05-27
- 影视字幕翻译研究与项目管理.陈曌赟
- 高质高效的社交游戏本地化项目研究——以《开心厨房》为例.卢玉莹
- 受控语言在英语技术文档写作中的应用研究.赵晨
- 基于语境理论的移动英语辅助.侯晓琛
- 科技翻译项目的风险管理研究初探——以F汽车维修手册文档的翻译项目为例.王水平
- 英文技术文档语域分析及写作项目研究.刘雪姣
- 基于单语语料库的翻译实务研究—Sketch Engine工具在翻译实践中的运用.王一一
- 基于系统功能语言学的中英翻译质量评估模式研究.林曦
- 基于SNS的用户协作型英汉翻译词典社区的研究与设计.李博
2010-05-19
- 面向概率型词汇知识库建设的名词语言知识获取.王萌
2009-06-08
- 英汉中动结构推导的句法语义界面研究.王珺
- 基于相似文本检测的反恶意文本系统.罗侃
- 基于条件随机场的中文地名和机构名识别.谭大伟
- 基于动态条件随机场的中文命名实体识别.张友书
- SVM和CRF结合下人的头部检测与精确定位.吴桐
2009-06-06
- 汉语的自动词义区分研究.朱虹
- 词义消歧若干关键技术研究.金澎
2009-06-05
- 皮钦语，克里奥尔语与二语习得－－－从皮钦英语到中国英语.邹强
2009-06-01
- 基于Web的汉语巨型辅助教学系统的设计与实现.王更生
- 基于语义相似度计算的汉语词义区分探索.闫国华
2008-12-01
- 基于角色词典的机构名识别.张伟伟
2008-06-12
- 综合型语言知识库系统原型的开发与中文缩略语知识库建设.支流
2007-12-01
- 大学生英语听力学习策略探索——石河子大学个案研究.李蕾
- 语法隐喻对英语教学的启示.陈荣泉
2007-06-13
- 英语Be-existential plus Relative结构与含义浅析.朱希滨
- 面向文本聚类的相似度计算方法研究.王洪俊
2007-06-12
- 最简方案理论框架下的英汉被动式句法研究.明小天
2006-12-14
- 面向问答系统的情感倾向分析研究.苏祺
- 面向领域本体进化的术语提取与术语层次关系发现.何燕
2006-06-14
- 《现代汉语语法信息词典》管理平台的设计开发和地名库建设.王媛媛
2006-06-11
- 基于错误驱动的术语间概念关系自动提取技术研究与实现.崔高颖
- 基于支持向量机的汉语词义消歧研究.幸运
- 用户兴趣引导下的网页收集研究.项锟
2006-06-09
- 汉语名词短语隐喻识别研究.王治敏
- 面向中文专著的汉韩机器辅助翻译研究.姜柄圭
2005-06-10
- 统计与规则相结合的粗粒度词义消歧软件的设计与实现.温珍珊
- 中文搜索结果的在线层次聚类技术.赖治国
- 中文术语自动提取技术研究.谌贻荣
2005-06-09
- 论汉英动结式.和平
2005-06-08
- 数学基础研究与转换生成语法的发展.沙陶金
2004-05-23
- 汉英机器翻译若干关键技术研究.刘群
2004-05-20
- 面向中文学术专著的机器辅助翻译研究.柏晓静
- 基于实体属性的中文网页检索研究.昝红英
2004-05-19
- 中文网页褒贬态度的机器评价.苏玉梅
- 基于PAT-Tree的领域关键词自动提取.吴拥华
2003-05-01
- 书面藏语语法信息知识库的设计与应用研究.陈玉忠
- 汉语新闻报道中的话题跟踪与识别研究.李保利
2001-05-01
- 语用失误及其对英语教学的启示.梁润生
- 关于常见错误“他”和“她”混用的改正.王巧云
- 名词性短语的形式化研究.刘鸿勇
- 现代汉语非受限文本的实语块分析.孙宏林
- 基于词汇语义分析的唐宋诗计算机辅助深层研究.胡俊峰
2000-06-01
- 继承——归纳机制及其在对象系统和信息提取技术中的应用.孙斌
2000-05-01
- 双语语料库的 XML 表示及其自动分类方法研究.程兆炜
1999-06-01
- 非确定性函数类与 SAT 的结构性质.刘田
1999-05-01
- 汉英机器翻译中的基于实例的转换引擎研究.常宝宝
1997-06-01
- 古诗研究的计算机支持系统和相关的计算语言学课题.沈钢
1996-01-01
- “古诗研究的计算机支持环境”的设计与实现.刘岩斌
1995-06-01
- 面向机器翻译的汉语句法规则和自动分析.陶晓明
1994-05-01
- 日汉机器翻译模型系统的实现和词汇功能语法的应用.刘东
1993-06-01
- 现代汉语语料库多级处理与汉语短语结构分析.周强
1991-06-01
- 一个定点驱动的双向句法分析器.毛少伟
1990-06-01
- 日汉机器翻译的研究和模型系统的实现.陈华
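
Since this post is meant for full-text search, the list above can also be loaded into a small script. The sketch below assumes the layout used above, a date line followed by "- title.author" lines, and splits each entry at its last full stop. The function names are made up for this example, and an entry without a full stop is kept with an empty author.

```python
import re

DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def parse_thesis_list(text):
    """Turn 'date' / '- title.author' lines into a list of dict records."""
    records = []
    current_date = None
    for raw in text.splitlines():
        line = raw.strip()
        if DATE.match(line):
            current_date = line
        elif line.startswith("- ") and current_date is not None:
            entry = line[2:]
            # Split at the LAST full stop, since titles may contain dots themselves.
            title, sep, author = entry.rpartition(".")
            if not sep:
                title, author = entry, ""
            records.append({"date": current_date, "title": title, "author": author})
    return records


def search(records, keyword):
    """Return the records whose title contains the keyword."""
    return [r for r in records if keyword in r["title"]]


sample = """2019-05-30
- 基于深度学习的自动句法纠错研究.黄浩洋
2015-05-31
- 基于语篇的英汉译文重构分析—以John F.Kennedy的汉译为例.胡蓉
"""
for record in search(parse_thesis_list(sample), "深度学习"):
    print(record["date"], record["title"], record["author"])
```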

2019-05-30

基于深度学习的自动句法纠错研究.黄浩洋


题名：	基于深度学习的自动句法纠错研究
姓名：	黄浩洋
学号：	1601210559
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2019-05-30
外文题名：	Deep learning based automatic grammer error correction
关键词：	自然语言处理；自动语法改错；平行语料；深度学习；预训练
外文关键词：	Natural language processing; Automatic grammar correction; Parallel corpus; Deep learning; Pre-training
论文摘要：	︿自动语法改错(GEC),是自然语言处理中句法分析中较为困难的任务之一。在日常对话中，语法上的细微差别对于一个非母语的人来说是最困难掌握与理解的，当前自然语言中的语法改错不仅包含语法错误，也包含拼写与搭配错误。近年来，随着深度学习的发展，自动语法改错任务得到了不少关注。基于统计机器翻译(SMT)的短语相关方法，是将GEC 看做一个翻译任务：从“坏”转换到“好”，所用的语料也是类似翻译语料的平行语料。不同于SMT 依赖于递归神经网络（RNN），也有通过卷积神经网络（CNN）来进行句子编码，提取以短语为基础的语义空间表征。这些方法都是通过建立端到端（encoder-decoder）的序列到序列（seq2seq）模型,理解错误句子与正确句子之间的语义以及词语表述的差异来定位语法错误。为了进一步充分学习数据中的知识，通过监督学习（supervised learning）方式是最常见的。该方法需要大量标注数据，但是标注成本巨大。学者们发现可以利用非标注（unlabeled）数据进行非监督学习,通过挖掘其中有价值语义信息帮助其他的监督任务理解。其中有利用基于翻译语料的预训练模型,也有利用长文本语料进行语言模型的预训练,还有利用多任务结合的泛化性预训练模型。这些预训练模型都在许多任务上经过检验，可以对模型表现有很大的提升。虽然自动改错模型可以借助比较新颖的模型架构，但是由于自动改错语料的缺失，更大范围的自动改错以及具有实际应用价值的自动改错模型建设依然不理想。而本次研究不仅提出了一种新的堆叠模型结构，同时该结构可将预训练的丰富语义信息的特征嵌入，得到一种可适配多种预训练方法的多层自动纠错模型。模型不仅可以进行多轮迭代解决改错难题，同时为了进一步缓解自动改错语料不足，利用了对偶学习方法产生更多额外训练数据。整体纠错框架不仅可以帮助理解词语之间的相关性、短语的连贯性、语义的匹配性，还有句子语法准确性。阶段式的模型结构，使得模块能高度可替换且可扩充。同时目前已经开源平行纠错语料以及实际改错样例表明，该模型不仅可以在学术数据集取得很不错的效果还能应用到实际场景。本文模型框架还能进一步融合目前最新的预训练模型权值，具有很强的可扩展性，这是其他所有工作所不具备的。使得本次研究更有意义以及未来研究价值。﹀
外文摘要：	︿ Automatic grammar correction (GEC) is one of the most difficult tasks in syntactic analysis in natural language processing. In daily conversations, grammatical nuances are the most difficult to grasp and understand for a non-native speaker. The grammatical corrections in current natural language include not only grammatical errors, but also spelling and collocation errors. In recent years, with the development of deep learning, the task of automatic grammar correction has received a lot of attention. The phrase-related method based on statistical machine translation (SMT) is to regard GEC as a translation task: from "bad" to "good", the corpus used is parallel corpus similar to translation corpus. Different from SMT, which relies on recurrent neural network (RNN), there are also convolutional neural networks (CNN) for sentence coding and extraction of phrase-based semantic spatial representation. These methods locate grammatical errors by establishing an encoder-decoder sequence-to-sequence (seq2seq) model that understands the semantics between erroneous sentences and correct sentences and the differences in word expressions. In order to further fully learn the knowledge in the data, supervised learning is the most common. This method requires a lot of annotation data, but the cost of labeling is huge. Scholars have found that unsupervised learning can be performed using unlabeled data to help other supervisory tasks understand by mining valuable semantic information. Among them are pre-training models based on translation corpus, pre-training using long text corpus, and generalized pre-training model using multi-task. These pre-training models have been tested on many tasks and can greatly improve the performance of the model. 
Although the automatic error correction model can be based on a relatively new model architecture, the automatic error correction and the automatic error correction model with practical application value are still not ideal due to the lack of automatic error correction corpus. This study not only proposes a new stacked model structure, but also embeds the features of pre-trained rich semantic information to obtain a multi-layer automatic error correction model that can adapt to multiple pre-training methods. The model can not only solve the problem of correcting errors by multiple rounds of iteration, but also use the dual learning method to generate more additional training data in order to further alleviate the lack of automatic error correction corpus. The overall error correction framework can not only help to understand the correlation between words, the coherence of phrases, the matching of semantics, but also the accuracy of sentence grammar. The staged model structure makes the module highly replaceable and expandable. At the same time, the open source parallel error correction corpus and the actual error correction examples show that the model can not only achieve good results in the academic data set but also apply to the actual scene. The model framework of this paper can further integrate the current pre-training model weights, which is highly scalable, which is not available in all other work. Make this study more meaningful and future research value. ﹀
分类号：	TP3
论文总页数：	65
参考文献总数：	50
参考文献列表：	︿ Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012. Gers, Felix. Long short-term memory in recurrent neural networks. Diss. Verlag nicht ermittelbar,2001. Bahdanau D, Cho K, Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate[J]. Computer Science, 2014. Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems. 2017. [3] Bengio Y, Schwenk H, Senécal J S, et al. Neural Probabilistic Language Models[J]. Journal of Machine Learning Research, 2003, 3(6):1137-1155. Chowdhury, Gobinda G. Introduction to modern information retrieval. Facet publishing, 2010. Mikolov, Tomas, et al. "Efficient estimation of word representations in vector space." arXiv preprint arXiv:1301.3781 (2013). [6] Wang Z, Hamza W, Florian R. Bilateral Multi-Perspective Matching for Natural Language Sentences[J]. 2017. Pennington, Jeffrey, Richard Socher, and Christopher Manning. "Glove: Global vectors for word representation." Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 2014. Collobert, Ronan, and Jason Weston. "A unified architecture for natural language processing: Deep neural networks with multitask learning." Proceedings of the 25th international conference on Machine learning. ACM, 2008. Le Q V, Mikolov T. Distributed Representations of Sentences and Documents[J]. 2014, 4:II-1188. Kiros R, Zhu Y, Salakhutdinov R, et al. Skip-Thought Vectors[J]. Computer Science, 2015. Yuan, Zheng, and Ted Briscoe. "Grammatical error correction using neural machine translation." Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016. Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." 
Advances in neural information processing systems. 2014. Freitag, Markus, and Yaser Al-Onaizan. "Beam search strategies for neural machine translation." arXiv preprint arXiv:1702.01806 (2017). Gu, Jiatao, et al. "Incorporating copying mechanism in sequence-to-sequence learning." arXiv preprint arXiv:1603.06393 (2016). Kaiser, Łukasz, and Samy Bengio. "Can active memory replace attention?." Advances in Neural Information Processing Systems. 2016. Kalchbrenner, Nal, et al. "Neural machine translation in linear time." arXiv preprint arXiv: 1610.10099 (2016). He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016. Ba, Jimmy Lei, Jamie Ryan Kiros, and Geoffrey E. Hinton. "Layer normalization." arXiv preprint arXiv:1607.06450 (2016). Radford, Alec, et al. "Improving language understanding by generative pre-training." URL https://s3-us-west-2. amazonaws. com/openai-assets/research-covers/languageunsupervised/language understanding paper. pdf (2018). Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief nets." Neural computation 18.7 (2006): 1527-1554. Poultney, Christopher, Sumit Chopra, and Yann L. Cun. "Efficient learning of sparse representations with an energy-based model." Advances in neural information processing systems.2007. McCann, Bryan, et al. "Learned in translation: Contextualized word vectors." Advances in Neural Information Processing Systems. 2017. Peters, Matthew E., et al. "Deep contextualized word representations." arXiv preprint arXiv:1802.05365 (2018). Kim, Yoon. "Convolutional neural networks for sentence classification." arXiv preprint arXiv:1408.5882 (2014). Rocktäschel, Tim, et al. "Reasoning about entailment with neural attention." arXiv preprint arXiv:1509.06664 (2015). Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." 
arXiv preprint arXiv:1810.04805 (2018). Junczys-Dowmunt, Marcin, and Roman Grundkiewicz. "Phrase-based machine translation is state-of-the-art for automatic grammatical error correction." arXiv preprint arXiv:1605.06353 (2016). Chollampatt, Shamil, and Hwee Tou Ng. "A multilayer convolutional encoder-decoder neural network for grammatical error correction." Thirty-Second AAAI Conference on Artificial Intelligence. 2018. Junczys-Dowmunt, Marcin, et al. "Approaching neural grammatical error correction as a low-resource machine translation task." arXiv preprint arXiv:1804.05940 (2018). Sennrich, Rico, Barry Haddow, and Alexandra Birch. "Neural machine translation of rare words with subword units." arXiv preprint arXiv:1508.07909 (2015). Dauphin, Yann N., et al. "Language modeling with gated convolutional networks." Proceedings of the 34th International Conference on Machine Learning-Volume 70. JMLR. org, 2017. Ge, Tao, Furu Wei, and Ming Zhou. "Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study." arXiv preprint arXiv:1807.01270 (2018). Rico Sennrich, Barry Haddow, and Alexandra Birch. Improving neural machine translation models with monolingual data. In ACL, 2016. He, Di, et al. "Dual learning for machine translation." Advances in Neural Information Processing Systems. 2016. Mizumoto, Tomoya, et al. "Mining revision log of language learning SNS for automated Japanese error correction of second language learners." Proceedings of 5th International Joint Conference on Natural Language Processing. 2011. Dahlmeier, Daniel, Hwee Tou Ng, and Siew Mei Wu. "Building a large annotated corpus of learner English: The NUS corpus of learner English." Proceedings of the eighth workshop on innovative use of NLP for building educational applications. 2013. Ng, Hwee Tou, et al. "The CoNLL-2014 shared task on grammatical error correction." Proceedings of the Eighteenth Conference on Computational Natural Language Learning: Shared Task. 2014. 
Napoles, Courtney, Keisuke Sakaguchi, and Joel Tetreault. "Jfleg: A fluency corpus and benchmark for grammatical error correction." arXiv preprint arXiv:1702.04066 (2017). Dahlmeier, Daniel, and Hwee Tou Ng. "Better evaluation for grammatical error correction." Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 2012. Napoles, Courtney, et al. "Ground truth for grammatical error correction metrics." Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). Vol. 2. 2015. Papineni, Kishore, et al. "BLEU: a method for automatic evaluation of machine translation." Proceedings of the 40th annual meeting on association for computational linguistics. Association for Computational Linguistics, 2002. Sutskever, Ilya, et al. "On the importance of initialization and momentum in deep learning." ICML (3) 28.1139-1147 (2013): 5. Felice, Mariano, and Zheng Yuan. "Generating artificial errors for grammatical error correction." Proceedings of the Student Research Workshop at the 14th Conference of the European Chapter of the Association for Computational Linguistics. 2014. Rozovskaya, Alla, and Dan Roth. "Grammatical error correction: Machine translation and classifiers." Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Vol. 1. 2016. Junczys-Dowmunt, Marcin, and Roman Grundkiewicz. "The AMU system in the CoNLL-2014 shared task: Grammatical error correction by data-intensive and feature- rich statistical machine translation." Proceedings of the Eighteenth Conference on Computational Natural Language Learning: Shared Task. 2014. Ji, Jianshu, et al. "A nested attention neural hybrid model for grammatical error correction." arXiv preprint arXiv:1707.02026(2017). 
Xie, Ziang, et al. "Noising and denoising natural language: Diverse backtranslation for grammar correction." Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Vol. 1.2018. Grundkiewicz, Roman, and Marcin Junczys-Dowmunt. "Near human-level performance in grammatical error correction with hybrid machine translation." arXiv preprint arXiv:1804.05945 (2018). 王建翔. 面向可读性评估的词向量技术研究及实现 [D]. 南京, 中国 : 南京大学, 2017. 宗成庆. 统计自然语言处理 [M]. 北京, 中国 : 清华大学出版社, 2013. ﹀
公开日期：	2019-06-11

2019-05-29

基于自然语言处理的学生英文检错规则抽取研究.杨越


题名：	基于自然语言处理的学生英文检错规则抽取研究
姓名：	杨越
学号：	1601210810
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2019-05-29
外文题名：	Research on the Extraction of English Error Detection Rules based on Natural Language Processing
关键词：	作文检错；规则抽取；规则匹配；自然语言处理
外文关键词：	Composition correction; Rules extraction; Rules matching; Natural language processing
论文摘要：	︿已有二语习得研究表明，提供有效校正反馈，有利于提高第二语言学习者语言水平。目前市面上也出现了一些英语写作的纠错工具，例如国外有LanguageTool、Grammarly，国内有批改网等软件。这些工具大多局限于英文写作中的单词拼写错误、语法错误，当涉及中式英语、搭配错误、句型错误、含义模糊等偏主观错误时，主要依靠人工制定规则进行识别。另外，虽然已有LanguageTool等开源工具可以进行错误识别，但是不能针对规则特点灵活进行适配和更改，且识别速度较慢。针对以上问题，本研究提出利用已标注的学习者语料，从中半自动地抽取检错规则，然后自行开发轻量级的规则匹配器来验证和应用规则。在本研究中，首先提取了英文改错规则。通过对已标注的学习者语料库CLEC和NUCLE进行详尽分析，确定可由程序自动抽取的错误类别；通过Java程序设计算法实现规则的初步提取，并且将抽取结果写入MySQL数据库。并且对抽取之后的规则进行测试和验证，通过人工方式筛选规则。最后合理利用牛津搭配词典、Google Books等语料资源对现有规则进行延伸，以达到通过订正错误来帮助学习者学习英语的目的。其次，笔者设计和实现了轻量级规则匹配器。本匹配器是针对抽取出的规则进行设计，可以对半自动抽取的规则表进行验证，也可证明从学习者语料库中半自动抽取规则的可行性。本研究的成果是通过科学方法从学习者语料库中抽取英文改错规则，识别准确率达90%以上；并且对规则进行了预处理，为后续专家校正提供了可靠依据，减少了时间成本；另外设计并实现了轻量级的规则匹配器，作为LanguageTool的补充，将速度提升30%以上，可以高效处理各种自定义规则。研究表明，此半自动生成规则与应用的方式，提高了效率，节省了人力，能够给英语学习者以帮助。同时此项目具有通用性和易扩展性，对于其他学习者语料库或语料资源，可以很好地进行扩展和进一步研究。﹀
English abstract:	︿ Studies on second language acquisition have shown that effective corrective feedback helps learners improve their second-language proficiency. There are already some error-correction tools for English writing on the market, such as LanguageTool and Grammarly abroad and Pigai.org in China. However, most of these tools are limited to spelling and grammatical errors, and detection relies mainly on manually written rules when it comes to more subjective errors such as Chinglish, collocation errors, sentence-pattern errors, and ambiguous meaning. Although open-source tools such as LanguageTool can identify errors, they cannot be flexibly adapted and changed to suit the rules, and their recognition speed is slow. To address these issues, this study uses annotated learner corpora to extract error-detection rules semi-automatically, and then develops a lightweight rule matcher to verify and apply the extracted rules. The first part is the extraction of English correction rules. First, the error categories that can be extracted automatically by a program are determined through a detailed analysis of the annotated learner corpora CLEC and NUCLE. Second, the initial extraction of the rules is implemented in Java, and the extraction results are written to a MySQL database. The extracted rules are then tested and verified, and filtered manually. Finally, resources such as the Oxford Collocations Dictionary and Google Books are used to extend the existing rules, so as to help learners learn English by correcting errors. The second part is the design and implementation of a lightweight rule matcher. The matcher is designed for the extracted rules and the existing rule base. On the one hand, the semi-automatically extracted rule table can be verified conveniently. On the other hand, the feasibility of semi-automatically extracting rules from a learner corpus can be demonstrated. The result of this research is that English error-correction rules can be extracted from learner corpora through scientific methods, with an accuracy rate of over 90%. Moreover, the rules are preprocessed, which provides a reliable basis for subsequent expert correction and reduces the time cost. Furthermore, a lightweight rule matcher is designed and implemented as a complement to LanguageTool; it increases speed by more than 30% and efficiently handles various customized rules. The study shows that generating and applying rules semi-automatically can improve efficiency, save manpower, and help English learners. The project is also general and extensible, so it can be extended to other learner corpora or resources for further research. ﹀
Classification number:	TP3
Total pages:	60
Number of references:	40
Release date:	2019-06-14

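Research on Video Action Recognition Based on Deep Learning. 常志勇

Link

Title:	Research on Video Action Recognition Based on Deep Learning (基于深度学习的视频行为识别研究)
Author:	常志勇
Student ID:	1601210438
Language:	chi
Major:	Professional degree - Engineering - Computer Technology
Release time:	After 3 years
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor:	俞敬松
Advisor affiliation:	Institute of Computer Science and Technology
Defense date:	2019-05-29
Keywords:	action recognition; deep learning; global mechanism; local dense connection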
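Abstract:	︿In recent years Internet technology has matured, and the spread of smartphones and other digital devices in particular has flooded the web with video. As the number of videos grows rapidly, some violent and pornographic content is spread unchecked, which harms the physical and mental health of young people and puts great pressure on online supervision. With the number of surveillance videos and Internet videos both continuing to grow, the demand for understanding video content and analyzing human actions in video keeps increasing. Using computers not only allows video content to be understood better, but also spares people from spending large amounts of time analyzing video. Deep learning has contributed greatly to computer vision. Training deep neural networks on large-scale datasets has enabled deep learning methods to achieve good results in object detection, image classification, and human action recognition in video. Because deep learning models image data well at an abstract level and can extract image features automatically, and because a video can be regarded as a stack of image frames, this thesis explores deep learning methods for recognizing human actions in video. The main work is as follows. Most existing action recognition methods based on two-stream convolutional networks use BN-Inception or VGG architectures, whose large parameter counts make the networks hard to train. This thesis therefore uses the DenseNet architecture to extract the spatial and temporal information of a video separately. The original DenseNet uses global dense connectivity, meaning each layer in a dense block is connected to every other layer, which tends to cause feature redundancy and a large parameter count. Moreover, because the input of each layer is the concatenation of the feature maps output by all preceding layers, these intermediate feature maps must be stored during both forward and backward propagation, so the original DenseNet consumes a large amount of memory during training. To address these problems, this thesis improves the original DenseNet. First, the all-to-all connections between layers are replaced with local connections, so that each layer is connected only to some of the preceding layers, which greatly reduces the number of parameters to be trained. Shared memory is also used to reduce the memory the model occupies. Second, existing two-stream convolutional networks make the final action prediction by taking a weighted average of the two networks' outputs, which does not make good use of the spatiotemporal information in the video; this thesis therefore merges the extracted video information along the spatial and temporal dimensions.﹀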
Classification number:	TP3
Total pages:	53
Number of references:	46
Release date:	2022-06-11

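Design and Implementation of a Corpus Query System for Writing Assistance. 胡盖蕾

Link

Title:	Design and Implementation of a Corpus Query System for Writing Assistance (辅助写作的语料库查询系统设计与实现)
Name:	胡盖蕾
Student ID:	1601210568
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Release time:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1:	俞敬松
Advisor 1 affiliation:	School of Software and Microelectronics
Defense date:	2019-05-29
English title:	Design and Implementation of a Corpus Query System for Writing Assistance
Keywords:	corpus; reading assistance; writing assistance; corpus query; system design and implementation
English keywords:	Corpus; Assisting Reading; Assisting Writing; Corpus Query; System Design and Implementation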
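Abstract:	︿English writing is a weak point in Chinese students' English ability. At present, the public English writing courses offered by Chinese schools have relatively limited teaching effect. Although there are websites and books for learning English writing, most are merely collections of resources such as template sentence patterns, and the guidance they offer learners in real time is not targeted. English writing aids mainly provide machine review and automatic scoring of essays; they only help users find common errors and cannot offer guidance on improving content that is already correct. For writing learners, consulting existing professional or well-written expressions is an effective way to learn, and a corpus, as a knowledge base of authentic language resources, can provide reliable guidance in learning and teaching writing. However, most corpora today are designed for academic research and are inconvenient for ordinary teachers and students. Specifically: (1) there are many functions and the query parameters are complicated to set, so the cost of learning to use them is high; (2) the first few search results are often long, difficult sentences, giving a poor reading experience; (3) query results inevitably contain complex vocabulary and sentences, which burden ordinary learners. From the perspective of assisting English writing, this study designs and implements a corpus query system for ordinary students and teachers, so that users can obtain corpus information conveniently and effectively and use corpus queries to help find and correct errors and weaknesses in their essays. This thesis first aims to solve the usability problems that current corpus query systems pose for ordinary English learners, including inconvenient functions and difficulty reading corpus text. To address them, the system implements a basic corpus search module, a search-result reconstruction module, and a sentence reading-assistance module. The basic search module provides common corpus query functions and a simple query mode, making the corpus more convenient to use. The search-result reconstruction module aims to improve the reading experience: it sorts example sentences from easy to difficult and specially marks words in the results that are unfamiliar to the user. The sentence reading-assistance module aims to help users learn from the corpus text by providing machine translation, syntactic decomposition, simple sentences, and related information. The thesis then studies the application of corpus queries in writing scenarios. Focusing on specific problems that often appear in Chinese students' English writing, such as rigid word collocations and overly colloquial collocations, it implements a corpus-query-assisted writing/correction module. This module provides three functions, namely collocation extraction from essays, collocation richness analysis, and collocation verification, and uses corpus data to help users review and improve the collocations in their essays. In testing, search-result reconstruction effectively increased users' interest in reading corpus text; the sentence reading-assistance module's help in understanding long, difficult sentences was unanimously recognized by the test subjects; and corpus-assisted writing reduced collocation errors and repeated collocations in practice. After 10 essays were revised, their scores on Pigai (批改网) rose by 1.9 points on average (out of 100), with a maximum gain of 5 points.﹀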
English abstract:	︿ English writing is a weak point for Chinese students. English writing classes offered by domestic schools have relatively limited effect. Although there are websites and books that aim to help students write in English, most of them simply display resources such as sentence patterns and do not give users targeted guidance. Most English writing aids cannot help learners reach a higher level of writing, because they cannot provide guidance on sentences that are already correct; all they do is examine essays, pick out common mistakes, and assign grades. Referring to idiomatic writing is an effective way for learners to improve their writing ability. A corpus, as a data bank of language, can provide reliable guidance for learning and teaching writing. However, today's corpora are mostly designed for academic purposes, which do not fit the goals of ordinary teachers and students. Teachers and students may find such corpora inconvenient because: 1) they have too many functions and complicated query parameter settings, so they are hard to learn; 2) the first few search results are often long and difficult sentences, which makes for a poor reading experience; 3) complex words and sentences in the query results can be a burden for average learners. Motivated by the idea of assisting English writing, this study designs and implements a corpus query system for ordinary students and teachers. Using the system, users can easily obtain corpus information and use corpus queries to help identify and correct deficiencies in their essays. This thesis first tackles the usability problems, including inconvenient functions and difficulty reading corpus data. To overcome these obstacles, the system is designed with a basic query module, a query-result reconstruction module, and a sentence reading-assistance module. The basic query module includes a simple query mode, which makes the corpus more convenient to use. The result reconstruction module aims to improve the reading experience: it sorts the example sentences in order of difficulty and highlights unfamiliar words. The sentence reading-assistance module provides syntactic splitting results (including clause recognition and collocation), machine translation results, simple sentences, and related information. This study also explores the application of corpus queries in English writing. To address the lexical fossilization and colloquialism in Chinese students' writing, the author develops a corpus-assisted writing/examining module to help them write by querying the corpus. This module extracts collocations, analyzes their richness, and verifies them, helping users examine and improve their use of collocations. Tests show that result reconstruction effectively boosts users' interest in reading corpus query results. The sentence reading-assistance module was widely welcomed by the subjects when tested on understanding long, difficult sentences. The assisted writing module improves the use of collocations in essays. On average, the scores of 10 revised essays graded on Pigaiwang (a website that provides a grading module) rose by 1.9 points (out of 100) after the system was used for revision; the largest gain was 5 points. ﹀
Classification number:	TP3
Total pages:	106
Number of references:	86
Release date:	2019-06-12

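Research on Key Technologies for Literature-Based Target Prediction of Classical TCM Formulas. 张琢

Link

Title:	Research on Key Technologies for Literature-Based Target Prediction of Classical TCM Formulas (基于文献的中医经方靶点预测关键技术研究)
Name:	张琢
Student ID:	1601210876
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Release time:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1:	刘耀
Advisor 1 affiliation:	Institute of Scientific and Technical Information of China
Defense date:	2019-05-29
Keywords:	literature-based; classical TCM formulas; component extraction; target prediction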
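Abstract:	︿Jingfang (经方) is short for the classical formulas of traditional Chinese medicine (TCM), and a target is the binding site in the human body where a drug acts to produce its clinical effect. Across the Chinese and international literature, there are few studies of targets based on classical formulas. One reason is that classical formulas have limited international influence and foreign scholars are constrained by language, so very few of their studies investigate targets at the level of a whole formula. Chinese scholars have a natural language advantage, but because of differences in diagnostic and treatment philosophy, they too have published few studies of targets at the formula level. Moreover, studying targets at the formula level requires manually searching the literature, determining the formula's formulation from ancient texts and its components from books and journal papers, handling synonyms properly, translating the components into appropriate English, and retrieving the corresponding targets from a target database, all before experimental work can begin. Some formulas have several versions of their formulation, each formulation contains several herbs, each herb has many components, and each component corresponds to different targets; these terms are also expressed inconsistently across the literature. Researchers therefore have to do a great deal of literature review before any actual medical experiment. This thesis therefore aims to discover the targets of classical formulas from the literature, to give researchers a reference and ease the burden of literature review. The simplest way to discover a formula's targets from the literature is to find papers that study them directly and extract the information. Because such papers are scarce, this thesis constructs several paths from formula to target to establish the link between them. A formula is made up of herbs, herbs are made up of components, and components can be linked directly to targets in a target database. Accordingly, the thesis first extracts formula names, formulations, and components, then extracts targets, then builds a target screening model with the help of a target database, and finally aggregates targets from the different sources, computes a confidence score for each, and outputs a list of predicted targets for the formula. The target prediction workflow is decomposed into four modules: formulation extraction, component extraction, target acquisition, and target prediction, and the key techniques are studied around these four modules. For formulation extraction, the thesis parses several classical books to extract formulations, compares the results automatically, and outputs a relatively consistent formulation, which solves the problem of extracting key information from multi-version sources. This part originates in the application but is not confined to it; rather, it uses a simple workflow to verify the feasibility of the technique and offers a solution for similar problems. For component extraction, journals are classified by quality according to impact factor and core-journal status; classical books and high-quality journal papers serve as the main sources, with other literature as a supplement, and component terms are extracted by combining rules and statistics. In the component extraction model, a general-purpose word segmentation tool first segments the text and key sentences are selected; the key sentences are then re-segmented with reverse maximum matching, which ensures that component terms containing special symbols can be segmented and extracted. A bigram model augmented with rules is also introduced, and new component terms are discovered by computing word frequency, mutual information, and information entropy, which reduces the segmentation tool's dependence on its initial dictionary. For target acquisition, the thesis proposes two approaches. The first extracts targets directly from the literature with regular expressions: first the targets of the formula itself, then those of the herbs in its formulation, and then those of its components. The extraction method is essentially the same in each case, but the relevance of the three to the formula's targets decreases in that order, so the corresponding target confidence decreases accordingly. The second first obtains the targets of the formula's components from a target database and then screens them by classifying the literature in which component and target co-occur. The English translations of the components are first obtained from classical books, and the corresponding targets are then retrieved from the DrugBank target database. We assume that if the relationship between a component and a target is credible, the literature in which they co-occur must contain suitable supporting references. In this way, target screening is turned into the problem of classifying literature in which formula components and targets co-occur. We therefore collected the literature in which formulas and targets co-occur and annotated it manually, treating documents that corroborate a target as target-evidence documents, framing whether a document is target evidence as a binary classification problem, and using whether a target has such evidence as the preliminary basis for screening. We selected several classic text features and ran comparative feature-selection experiments with naive Bayes, KNN, and SVM classifiers, finally settling on a combination of information gain and the chi-square test for features and an SVM model for classifying target-evidence documents, thereby achieving preliminary target screening. Finally, the thesis builds an integrated target prediction model for classical formulas and proposes a target confidence scoring model that scores each target according to its source and the number and relevance of its evidence documents. Comprehensive experiments were carried out with the prediction model: a threshold-tuning experiment using Shaoyao Gancao Tang (芍药甘草汤) as an example, followed by validation of the model's effectiveness using Dahuang Huanglian Xiexin Tang (大黄黄连泻心汤) and Sini Tang (四逆汤).﹀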
Classification number:	TP3
Total pages:	71
Number of references:	80
Release date:	2019-06-13

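Research on Key Technologies for Automatic Generation of Scientific Briefings Based on Network Representation Learning. Zhang Yue (张越)

Link

Title:	Research on Key Technologies for Automatic Generation of Scientific Briefings Based on Network Representation Learning (基于网络表示学习的科技简报自动生成关键技术研究)
Name:	Zhang Yue (张越)
Student ID:	1601210872
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Release time:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Degree-granting institution:	Peking University
School:	School of Software and Microelectronics
Supervisor 1 name:	Liu Yao (刘耀)
Supervisor 1 affiliation:	School of Software and Microelectronics
Defense date:	2019-05-29
Keywords:	scientific briefing; concept and relation labeling; knowledge network; network representation learning; scientific briefing generation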
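Abstract:	Scientific briefings are an important genre of official science and technology intelligence documents, and they give decision-making bodies at all levels a reference for formulating science and technology policy. With the arrival of the big-data era, however, the content, organization, and application of traditional science and technology intelligence have all changed, and extracting important information from massive collections of science and technology reports is one of the major challenges researchers now face. The core technology of existing automatic briefing-generation systems is text generation, which is not tailored to the textual characteristics of scientific briefings, so the reports are under-used as a resource. In addition, existing text generation techniques cannot effectively capture the rich knowledge-network information in science and technology policy corpora. Against this background, this thesis analyzes and experiments with earlier research results. It compares the scientific briefings published over the years by a science and technology research institute with the original policy texts, and uses this retrospective approach to study how a short briefing can be generated from a lengthy policy text. The key technologies involved are concept and relation labeling, knowledge network construction, network representation learning, and text generation, and the thesis finally uses them to implement automatic generation of scientific briefings. The thesis first analyzes the textual features of science and technology policy texts and, following classification schemes for concepts and for relations between concepts, automatically labels the concepts and concept relations in those texts. Concept labeling uses an RNN+CRF deep learning method that automatically recognizes concept terms in a sentence and attaches category labels. Relation labeling analyzes the features of relations between concepts and uses SVM-based active learning classification to label relations for pairs of concept entities. Experiments show that these methods can effectively label concept terms in science and technology policy texts and predict the relations between them. With automatic labeling in place, the thesis then studies how to build a knowledge network from concepts and concept relations, and proposes construction methods for a concept knowledge network and for a knowledge network that incorporates discourse structure. On top of this knowledge network model, it adopts a network representation learning model that fuses node semantics, topology, and category label information, and improves the model by introducing the Node2vec algorithm and knowledge reasoning information. It also analyzes the representation of discourse nodes in the knowledge network with discourse structure. An SVM classifier and visualization are used to show that the proposed network representation learning method represents the nodes of the knowledge network more effectively. Finally, the thesis applies the knowledge network node representations to the automatic generation of briefing structure and content from both single and multiple source documents. For the briefing structure, it either retains the structure of the original text or selects important discourse nodes. For the text content, it studies both extractive and abstractive generation. The thesis concludes by implementing automatic scientific briefing generation tailored to the writing characteristics of briefing texts.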
English abstract:	The scientific briefing is an important type of document in scientific and technological intelligence, and it gives decision makers at all levels a reference for formulating science and technology policies. However, with the advent of the big data era, the content, organizational model, and application of traditional scientific and technological intelligence have changed in various ways. Extracting valuable information from the vast amount of scientific and technological report data is one of the challenges researchers currently face. At present, the core of automatic briefing generation systems is text generation, which ignores the textual characteristics of scientific briefings, so the utilization rate of scientific reports is low. In addition, existing text generation technology cannot adequately reflect the rich knowledge network information in the science and technology policy corpus. Based on this situation, this paper analyzes and experiments with previous research results, mainly comparing the scientific briefings published by the Science and Technology Research Institute over the years with the original science and technology policy texts, and studies how to generate short briefings from lengthy policy texts. The key techniques include concept and relationship labeling, knowledge network construction, network representation learning, and text generation. At the end of the paper, these technologies are used to implement automatic generation of scientific briefings. The paper first analyzes the textual characteristics of science and technology policy texts and automatically labels the concepts and concept relationships in them according to classification schemes for concepts and for the relationships among concepts. Concept labeling adopts a deep learning method based on RNN+CRF, which automatically recognizes concepts in a sentence and adds category labels. Relationship labeling mainly analyzes the characteristics of relationships between concepts and adopts an SVM active learning classification method to label the relationships between unlabeled concept entity pairs. Experiments show that the technique used in this paper can effectively recognize concept words in science and technology policy texts and predict the relationship labels between them. After automatic labeling of science and technology policy texts is achieved, this paper further studies how to construct a knowledge network from the relationships between concepts. Based on this knowledge network model, the paper adopts a basic network representation learning model that can fuse node semantics, topology, and category label information, and introduces the Node2vec algorithm and knowledge representation learning algorithms to improve the basic model. The paper also analyzes node representation in the knowledge network with chapter structure. An SVM classifier and visualization show that the network representation learning method proposed in this paper represents the nodes in the knowledge network more effectively. Finally, this paper applies the knowledge network node representation to the automatic generation of chapter structure and content for single-article and multi-article scientific briefings. To generate the briefing's text structure, the paper either retains the original text structure or selects important chapter nodes. For text content generation, it studies both extractive and abstractive methods. The paper then focuses on the writing characteristics of scientific briefing texts and completes the automatic generation of scientific briefings.
Classification number:	TP3
Total pages:	80
Total references:	55
Release date:	2019-06-04

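Research on Key Technologies of Science and Technology Policy Diffusion Based on Text Analysis and Computation. Zhang Liying (张丽颖)

Link

Title:	Research on Key Technologies of Science and Technology Policy Diffusion Based on Text Analysis and Computation (基于文本分析与计算的科技政策扩散关键技术研究)
Author:	Zhang Liying (张丽颖)
Student ID:	1601210855
Language:	chi
Major:	Professional degree - Engineering - Computer Technology
Release time:	After 3 years
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Degree-granting institution:	Peking University
School:	School of Software and Microelectronics
Supervisor name:	Liu Yao (刘耀)
Supervisor affiliation:	School of Software and Microelectronics
Defense date:	2019-05-29
Keywords (Chinese):	policy diffusion; science and technology policy; text mining; text analysis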
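Abstract:	Since the reform and opening-up, new science and technology policies have emerged in China one after another. These policies diffuse across levels of government and across regions, and they have an important effect on the policy behavior of governments at every level and on improving science and technology governance, yet few scholars have studied this diffusion in depth. Existing research on science and technology policy diffusion is also mostly qualitative, focusing on theoretical frameworks, constraining factors, and case studies of diffusion, and it lacks systematic validation of the proposed theories and models. The few quantitative studies mainly apply basic statistical analysis with substantial manual effort; they do not automatically mine policy content and attributes, which makes it hard to extract diffusion relations precisely or to detect changes in content. Against this background, and building on a review of earlier work, this thesis constructs a science and technology policy diffusion model at the structural and semantic levels, tailored to the characteristics of such diffusion, and introduces text analysis and computation methods from natural language processing to extract diffusion features and mine policy diffusion relations automatically. (1) Meaningful-string discovery in the policy domain and policy structure extraction. Science and technology policies contain many new and lengthy terms, so conventional word segmentation does not meet the needs of analysis. To address this, the thesis proposes an optimization method based on rules and information entropy, and experiments show that it correctly segments the great majority of meaningful strings in science and technology policy texts. For policy structure, the thesis proposes methods for extracting and for extending the organizational structure. It first extracts a policy's organizational structure by exploiting the conventions of policy writing together with term frequency and the TextRank algorithm. It then builds a structural vocabulary for the science and technology policy domain and uses it to extend the organizational structure, finally extracting the policy's basic facets. (2) Representation of policy diffusion features and determination of policy diffusion relations. The thesis studies diffusion features from both structural and semantic perspectives, extracting organizational-structure correlation features, basic-facet identity features, feature-word inheritance features, and Doc2vec-based text similarity features. On this basis it uses a decision tree classifier, casting relation determination as a classification problem so that multiple features are handled in an integrated way. Experiments show that the resulting multi-feature classification model can effectively determine policy diffusion relations. (3) Policy diffusion identification. To meet the need to analyze policy diffusion within a single topic, the thesis builds a policy diffusion identification framework and introduces a Ranking SVM model, which it adapts by fusing policy diffusion features with text diversity features. It then proposes a method for computing the ranking distance between policies from their ranking scores, and searches for the largest ranking distance at which a diffusion relation still holds, which serves as an empirical threshold for diffusion identification. This threshold is used to optimize the identification model, so that diffusion pairs and diffusion sets of policies are computed and output automatically during retrieval. Experiments show that the framework can effectively extract diffusion sets and meets users' needs for mining policy diffusion relations within a topic.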
Classification number:	TP3
Total pages:	63
Number of references:	79
参考文献：	︿ [1] 张玉娟, 杨海丽, 孟潇,等. 基于计量可视化分析的科技政策研究现状[C]//北京科学技术情报学会年会--“科技情报发展助力科技创新中心建设”论坛. 2017. [2] Lazer D, Pentland A, Adamic L, et al. Computational Social Science[J]. Science, 2009, 323(1):721-723. [3] 裴雷, 孙建军, 周兆韬. 政策文本计算:一种新的政策文本解读方式[J]. 图书与情报, 2016(6):47-55. [4] 朱亚鹏. 政策创新与政策扩散研究述评[J]. 武汉大学学报(哲学社会科学版), 2010(4):565-573. [5] Rogers, E.M. Diffusion of Innovations[M]. New York: The Free Press. 1983. [6] Lucas A. Public Policy Diffusion Research: Integrating Analytic Paradigm [J]. Science Communication. 1983 ( 4 ) :379-408． [7] 杨代福. 西方政策创新扩散研究的最新进展[J]. 国家行政学院学报, 2016(1):122-126. [8] Walker J L. The Diffusion of Innovations Among the American States[J]. American Political Science Review, 1969,63(3):881-893. [9] Gray V. Innovation in the States: A Diffusion Study[J]. American Political Science Review, 1973, 67(4):1174-1185. [10] Savage R L. Policy innovativeness as a trait of American states[J]. The Journal of Politics,1978,40(1)：212-224. [11] Savage R L. Diffusion research traditions and the spread of policy innovations in a federal system [J]. Publius：The Journal of Federalism,1985,15(4)：1-28. [12] Mohr L B. Determinants of innovation in organizations[J]. American political science review, 1969, 63(1): 111-126. [13] Berry F S, Berry W D. State Lottery Adoptions as Policy Innovations: An Event History Analysis[J]. The American Political Science Review, 1990, 84(2):395. [14] Berry F S, Berry W D. Tax innovation in the states: Capitalizing on political opportunity[J]. American Journal of Political Science, 1992: 715-742. [15] Berry F S. Sizing up state policy innovation research[J]. Policy Studies Journal, 1994,22(3)：42-456. [16] Feiock R C, West J P. Testing competing explanations for policy adoption: Municipal solid waste recycling programs[J]. Political Research Quarterly, 1993, 46(2): 399-419. [17] Mintrom M, Vergari S. Policy networks and innovation diffusion: The case of state education reforms[J]. The Journal of Politics, 1998, 60(1): 126-148. 
[18] Godwin M L, Schroedel J R. Policy diffusion and strategies for promoting policy change: Evidence from California local gun control ordinances[J]. Policy Studies Journal, 2000, 28(4): 760-776. [19] Hays S P. Patterns of reinvention: The nature of evolution during policy diffusion[J]. Policy Studies Journal, 1996, 24(4): 551-566. [20] Mintrom M. The state-local nexus in policy innovation diffusion: The case of school choice[J]. Publius: The Journal of Federalism, 1997, 27(3): 41-60. [21] Mooney C Z, Lee M H. The temporal diffusion of morality policy: The case of death penalty legislation in the American states[J]. Policy Studies Journal, 1999, 27(4): 766-780. [22] Miller E A. Advancing Comparative State Policy Research: Toward Conceptual Integration and Methodological Expansion[J]. State & Local Government Review, 2004, 36(1):35-58. [23] Strebel F. Visibility and facticity in policy diffusion: going beyond the prevailing binarity[J]. Policy Sciences, 2012, 45(4):385-398. [24] Graham E R, Shipan C R, CraigVolden. The Communication of Ideas across Subfields in Political Science[J]. Ps Political Science & Politics, 2014, 47(2):468-476. [25] 杨启光. 全球化进程中的国际教育政策转移[J]. 比较教育研究, 2009(12):113-117. [26] 包海芹. 国家学科基地政策扩散研究[M]. 北京大学出版社, 2011. [27] 郭璇. 试析全球化语境下创意产业政策的移植和扩散机制[J]. 浙江社会科学, 2015(6):82-86. [28] 张剑,黄萃,叶选挺,等. 中国公共政策扩散的文献量化研究——以科技成果转化政策为例[J]. 中国软科学,2016 (2)：145-155. [29] 林雪霏. 政府间组织学习与政策再生产：政策扩散的微观机制——以“城市网格化管理”政策为例[J]. 公共管理学报, 2015(1):11-23. [30] 王洪涛,魏淑艳. 地方政府信息公开制度时空演进机理及启示——基于政策扩散视角[J].东北大学学报(社会科学版),2015,17(6)：600-606. [31] 裴雷, 张奇萍, 李向举,等. 中国信息化政策扩散中的政策主题跟踪研究[J]. 图书与情报, 2016(6):63-71. [32] 施茜, 裴雷, 邱佳青. 政策扩散时间滞后效应及其实证评测——以江浙信息化政策实践为例[J]. 图书与情报, 2016(6):56-62. [33] 施茜, 裴雷, 李向举, 等. 信息化政策理论与实践的交互扩散研究——以江浙信息化政策样本为例[J]. 情报学报, 2016, 35(10):1081-1089. [34] 王周宾. 新型农村养老保险试点中的政策扩散机制研究[D].2016. [35] 王小杰. 政策扩散视角下中国铁路技术规章管理的文献量化与博弈研究[D]. 2018. [36] 陈芳. 政策扩散、政策转移和政策趋同——基于概念、类型与发生机制的比较[J]. 厦门大学学报(哲学社会科学版), 2013(6):8-16. [37] 刘伟. 国际公共政策的扩散机制与路径研究[J]. 世界经济与政治, 2012(4):40-58. 
[38] 王浦劬, 赖先进. 中国公共政策扩散的模式与机制分析[J]. 北京大学学报(哲学社会科学版), 2013, 50(6):14-23. [39] 周望. 政策扩散理论与中国“政策试验”研究:启示与调适[J]. 四川行政学院学报, 2012(4):43-46. [40] Sausgruber R, Tyran J R. Are we taxing ourselves? How deliberation and experience shape voting on taxes[J]. Journal of Public Economics, 2011, 95(1):164-176. [41] Volden C. States as Policy Laboratories: Emulating Success in the Children's Health Insurance Program[J]. American Journal of Political Science, 2010, 50(2):294-312. [42] Garrett K N, Jansa J M. Interest group influence in policy diffusion networks[J]. State Politics & Policy Quarterly, 2015, 15(3): 387-417. [43] Desmarais B A, Harden J J, Boehmke F J. Persistent policy pathways: Inferring diffusion networks in the American states[J]. American Political Science Review, 2015, 109(2): 392-406. [44] Wilkerson J D. Large-scale Computerized Text Analysis in Political Science[J]. Annual Review of Political Science, 2017, 20(1):529-544. [45] Svyatkovskiy A, Imai K, Kroeger M, et al. Large-scale text processing pipeline with Apache Spark[C]// IEEE International Conference on Big Data. 2016. [46] Linder F, Desmarais B A, Burgess M, et al. Text as Policy: Measuring Policy Similarity Through Bill Text Reuse[J]. Social Science Electronic Publishing, 2016. [47] Gilardi F, Shipan C R, Wueest B. Policy diffusion: The issue-definition stage[J]. University of Zurich and University of Michigan, 2018. [48] 武学振. 中国省级政府信息政策创新扩散研究[D]. 2016. [49] Comparative Agendas. [EB/OL]. https:.www.comparativeagendas.net/. [50] 汪涛, 谢宁宁. 基于内容分析法的科技创新政策协同研究[J]. 技术经济, 2013, 32(9):22-28. [51] 胡赛全, 詹正茂, 钱悦, 等. 战略性新兴产业发展的政策工具体系研究——基于政策文本的内容分析[J]. 科学管理研究, 2013, 31(3):66-69. [52] Grimmer J, Stewart B M. Text as data: The promise and pitfalls of automatic content analysis methods for political texts[J]. Political analysis, 2013, 21(3): 267-297. [53] 李江, 刘源浩, 黄萃,等. 用文献计量研究重塑政策文本数据分析——政策文献计量的起源、迁移与方法创新[J]. 公共管理学报, 2015(2). [54] 陈慧茹, 肖相泽, 冯锋. 科技创新政策加权共词网络研究——基于扎根理论与政策测量[J]. 科学学研究, 2016(12):12-19. 
[55] 丁洁兰, 刘细文, 杨立英, 等. 科学计量方法在科技政策研究中应用的实证研究[J]. 图书情报工作, 2017(61):86. [56] Laver M, Benoit K, Garry J. Extracting Policy Positions from Political Texts Using Words as Data[J]. American Political Science Review, 2003, 97(02):311-331. [57] Slapin J B, Proksch S O. A scaling model for estimating time‐series party positions from texts[J]. American Journal of Political Science, 2008, 52(3): 705-722. [58] Hopkins D J, King G. A method of automated nonparametric content analysis for social science[J]. American Journal of Political Science, 2010, 54(1): 229-247. [59] Leifeld P, Haunss S. Political discourse networks and the conflict over software patents in Europe[J]. European Journal of Political Research, 2012, 51(3): 382-409. [60] Nowlin M C. Modeling issue definitions using quantitative text analysis[J]. Policy Studies Journal, 2016, 44(3): 309-331. [61] 李辉, 曾文, 吴晨生, 等. 中文科技政策数据分析方法研究——以新能源汽车领域科技政策为例[J]. 现代情报, 2018, v.38；No.324(06):70-74. [62] 保罗·A.萨巴蒂尔. 政策过程理论[M]. 三联书店, 2004. [63] 孙蕊, 吴金希, 王少洪. 中国创新政策演变过程及周期性规律[J]. 科学学与科学技术管理, 2016, 37(3):13-20. [64] 李庆. 科技创新政策的转移、转移网络和竞争力研究：以国家自主创新示范区为例[D]. 2017. [65] Piwowar H. Altmetrics: Value all research products[J]. Nature, 2013, 493(7431): 159. [66] 苏竣, 黄萃. 中国科技政策要目概览[M]. 北京:科学技术文献出版社,2012. [67] Lundvall B Å, Borrás S. Science, technology and innovation policy[J]. The Oxford handbook of innovation, 2005: 599-631. [68] 苏竣. 公共科技政策导论[M]. 科学出版社, 2014. [69] 常耀成, 张宇翔, 王红, 等. 特征驱动的关键词提取算法综述[J]. 软件学报, 2018, v.29(07):224-248. [70] Entman R M, Rojecki A. Freezing out the public: Elite and media framing of the US anti‐nuclear movement[J]. 1993. [71] 樊梦佳, 段东圣, 杜翠兰, 等. 统计与规则相融合的领域术语抽取算法[J]. 计算机应用研究, 2016, 33(8). [72] 张越, 刘琦岩, 张玄玄, 望俊成. 科技成果转化政策文本中的领域关键词汇提取研究[J].中国科技资源导刊,2018,50(03):68-75. [73] 曾文, 李智杰, 王小玉, 等. 科技政策术语自动识别技术初探[J]. 中国科技资源导刊, 2017(3). [74] 国家行政机关公文处理办法. [EB/OL]. http:.www.gov.cn/gongbao/content/2000/content_60454.htm. 2000. [75] 魏伟, 郭崇慧, 陈静锋. 国务院政府工作报告(1954—2017)文本挖掘及社会变迁研究[J]. 情报学报, 2018, v.37(04):70-85. 
[76] Le Q V, Mikolov T. Distributed representations of sentences and documents[J]. arXiv:1405.4053, 2014. [77] 苏金树, 张博锋, 徐昕. 基于机器学习的文本分类技术研究进展[J]. 软件学报, 2006, 17(9):1848-1859. [78] Herbrich R. Large margin rank boundaries for ordinal regression[J]. Advances in large margin classifiers, 2000: 115-13. [79] Joachims T. Optimizing search engines using clickthrough data[C]// Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2002: 133-142. ﹀
Release date:	2022-06-12

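Research on Key Technologies of Dermatosis Diagnosis and Treatment Path Based on Monte Carlo Algorithms. Zhang Jin (张瑾)

Link

Title:	Research on Key Technologies of Dermatosis Diagnosis and Treatment Path Based on Monte Carlo Algorithms (基于蒙特卡罗算法的皮肤病诊疗路径关键技术研究)
Author:	Zhang Jin (张瑾)
Student ID:	1601210849
Language:	chi
Major:	Professional degree - Engineering - Computer Technology
Release time:	After 3 years
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Degree-granting institution:	Peking University
School:	School of Software and Microelectronics
Supervisor name:	Liu Yao (刘耀)
Supervisor affiliation:	School of Software and Microelectronics
Second supervisor name:	Gao Zhijun (高志军)
Second supervisor affiliation:	School of Software and Microelectronics
Defense date:	2019-05-29
Title (English):	Research on Key Technologies of Dermatosis Diagnosis and Treatment Path Based on Monte Carlo Algorithms
Keywords (Chinese):	dermatosis diagnosis and treatment; medical record analysis; Monte Carlo algorithm; shortest path
Keywords (English):	Dermatology Diagnosis and Treatment; Medical Record Analysis; Monte Carlo Algorithms; The Shortest Path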
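Abstract:	After more than 60 years of development, information technology has reached every aspect of social life. As it has been applied across the fields of medicine, large amounts of data have been generated. Skin diseases are common and frequently occurring, with more than a thousand related conditions. Medical record data is highly valuable, and the semantic knowledge points it contains can be used for clinical decision support and health management. Dermatology clinics across the country currently face long waiting times, difficulty obtaining care and medication, and inaccurate diagnoses, so dermatologists urgently need a computer-aided diagnosis and treatment tool that makes automatic recommendations to support decision-making and intelligent medical diagnosis. This thesis proposes a method, based on the Monte Carlo algorithm, that quickly identifies the knowledge points serving as the basis for dermatological diagnosis and treatment and forms a shortest diagnosis and treatment path, which is used to give physicians the best recommendation for the next step. Building on a systematic analysis and summary of the structure and content of dermatology medical records, taking the Routine of Dermatology Diagnosis and Treatment (《皮肤科诊疗常规》) as the basis for judging diagnostic evidence, and drawing on the characteristics and strengths of the Monte Carlo algorithm, the thesis proposes a scheme that uses medical records as the data set, takes the extraction and structuring of diagnostic evidence as its research object, generates dermatological diagnosis and treatment paths, and computes a shortest path by Monte Carlo training. The feasibility of the scheme is verified experimentally. The research focuses on three questions: how to analyze and model traditional dermatological diagnosis and treatment methods to form a matrix suited to Monte Carlo computation; how to extract diagnostic evidence from the correspondence between the structure and content of medical records and the treatment manual; and how to use Monte Carlo simulation and parameter tuning to generate the shortest diagnosis and treatment path in support of clinical practice. The specific work is as follows. The textual features of dermatology medical records and the treatment manual are analyzed, and the semantic and structural information of the text is mined in depth to extract the semantic set of diagnostic-evidence knowledge points. Based on text analysis models and machine learning, a matrix suited to Monte Carlo computation is formed and a dermatological diagnosis and treatment model is built. Based on the Monte Carlo algorithm, the thesis explores and implements the key technologies of representing and structuring the diagnosis and treatment process, computing diagnosis and treatment paths, and shortening them, and computes the shortest path. Finally, experiments demonstrate the effectiveness of these methods, which can be applied to recommending the best diagnostic evidence for the next step.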
Abstract (English):	Information technology has spread throughout society after more than 60 years of development. A large amount of data is generated as information technology is applied in medical science. Dermatosis is common and frequently occurring, and more than 1,000 kinds of dermatoses are now known. Medical record data is of great value, and its semantic knowledge points can be used for clinically assisted diagnosis and health management. At present, dermatology clinics nationwide face many problems, such as long waiting times, difficulty in obtaining medical care and medication, and inaccurate diagnoses by doctors. Dermatologists therefore urgently need a computer-aided diagnosis tool that makes automatic recommendations to assist decision-making and intelligent medical diagnosis. This paper proposes a method, based on Monte Carlo algorithms, to quickly extract diagnosis knowledge points and identify the shortest path of dermatological diagnosis and treatment, which can be used to provide doctors with a recommendation for the next step. After systematic analysis and summary of the structure and content of dermatology medical records, this paper proposes a scheme for calculating the shortest path of dermatology diagnosis and treatment based on Monte Carlo algorithms. The scheme takes the Routine of Dermatology Diagnosis and Treatment as the basis of diagnosis and treatment, draws on the advantages of Monte Carlo algorithms, and uses dermatology diagnosis and treatment records as the data set from which diagnosis knowledge points are extracted and structured. The shortest-path scheme is thus proposed, and its feasibility is verified by experimental study. The focus of this paper is how to model traditional diagnosis and treatment methods to form a large matrix that can be adapted to Monte Carlo algorithms, how to extract the knowledge points of dermatology diagnosis and treatment from the correspondence between the structure and content of medical records and the Routine of Dermatology Diagnosis and Treatment, and how to adjust the parameters and calculate the shortest path of dermatosis diagnosis and treatment based on Monte Carlo algorithms. The research work is as follows. First, the text characteristics of dermatological medical records and the Routine of Dermatology Diagnosis and Treatment are analyzed. Based on modeling of the text structure, the knowledge points of diagnosis and treatment are extracted through automatic rule extraction. Then, a large matrix suitable for Monte Carlo algorithms is formed on the basis of a text analysis model and machine learning technology, and a dermatological diagnosis and treatment model is constructed. After the diagnosis and treatment processes are represented and structured, Monte Carlo algorithms are used to evolve the diagnosis and treatment paths and calculate the shortest path of dermatological diagnosis and treatment. Finally, the effectiveness of the above methods is demonstrated by experiments, and a recommendation system for optimal intelligent dermatological diagnosis and treatment is designed and implemented.
Classification number:	TP3
Total pages:	70
Number of references:	105
Release date:	2022-06-21

面向领域的先进技术侦测关键技术研究.张茜

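Title:	面向领域的先进技术侦测关键技术研究
Author:	张茜
Student ID:	1601210859
Language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public availability:	after 3 years
Level of study:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor:	刘耀
Advisor's affiliation:	Institute of Scientific and Technical Information of China
Second advisor's affiliation:	Peking University
Defense date:	2019-05-29
Title (foreign language):	Research on Domain-Oriented Advanced Technology Detection
Keywords (Chinese):	先进技术；技术侦测；文本挖掘
Keywords (foreign language):	Advanced Technology; Technology Detection; Text Mining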
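Abstract:	This thesis addresses the lack of a comprehensive model for advanced technology detection in existing technology detection research, and studies the key technologies of advanced technology detection using domain scientific literature. A survey shows that mining technology points and identifying how their advancement features are expressed in text are important tasks in building a reasonable and effective advanced technology detection model. Accordingly, the thesis first builds, following the ideas and methods of advanced technology detection, a detection model that covers mining of potential domain technology points, mining of potential advanced domain technologies, and discovery of their features, and then builds a fusion model of domain advanced-technology features to complete the model design. In parallel, it establishes an index system for evaluating technological advancement. Finally, the initial model is tested in several technical domains and the results are compared against the evaluation indices to optimize the model. The specific research content is as follows. First, the thesis examines how different types of scientific literature resources behave as sources of technology points and formulates a term acquisition strategy for each type; the characteristics of multi-source literature also inform the selection of text features for advanced technologies. Scientific literature is also used to build a domain knowledge base, whose conceptual structure helps later stages mine advancement-related text features. This part proposes a TFIDFC-value method for extracting technical strings, designed around the characteristics of technology points, and experiments show that the method is effective to a certain degree. The domain technology points obtained this way serve as the basic technical terms for mining advanced domain technologies and as part of the objects of advancement evaluation. Second, the thesis extracts text features of technological advancement from four aspects: the technology life cycle, the evolution of domain technology topics, the terminology of domain science and technology texts, and the domain patent layout. It assumes that advanced technologies are in the emerging and growth stages of the technology life cycle and appear in new topics of domain technology evolution, in new terms in domain project texts, and in the non-mainstream patent layouts of large companies in the domain. Based on these assumptions and the basic technical terms, it extracts candidate technical terms that may be advanced, which widens the range of terms acquired. Next, drawing on related research, the thesis summarizes an index system for evaluating technological advancement and fuses the previously extracted text feature information, with the fusion based on theories such as technology maturity and technological knowledge diffusion. The index system is used for domain advanced technology detection and, together with the mining of advanced technology features, forms the initial advanced technology detection model. Finally, the thesis takes autonomous vehicles and the Internet of Things as the objects of backtracking experiments. The experiments show that the initial model is effective. Based on the backtracking results, the model is improved by raising the specialization and unit nature of the technology points, and the improved results are compared with the original ones. The results show that the improved model raises, to some extent, the accuracy of advancement detection for the top-ranked candidate technology points.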
Abstract (foreign language):	This paper mainly discusses the key technologies of advanced technology detection based on domain scientific literature. A survey shows that, in advanced technology detection, technology point mining, together with the way its advanced characteristics are embodied in text, is an important task in building a reasonable and effective advanced technology detection model. Therefore, this paper first constructs a detection model that includes domain potential technology point mining, domain potential advanced technology mining, and feature discovery, based on the theories and methodologies of advanced technology detection, and establishes a fusion model for domain advanced technology features to improve the initial model design. At the same time, an index system for the evaluation of technological advancement is established. Finally, the initial model is tested in several technical fields, and the experimental results are compared with the evaluation indices to optimize the detection model. The specific research contents are as follows. Firstly, this paper discusses the characteristics of different types of scientific literature resources in the acquisition of technology points and formulates corresponding acquisition strategies for technical terms based on those characteristics; the characteristics of multi-source literature also provide a basis for selecting advanced technical text features. At the same time, a domain knowledge base is established using scientific literature resources, and its conceptual structure will help follow-up research mine advanced text features. In this part, a TFIDFC-value string extraction method based on the characteristics of technology points is proposed, and experiments show that the method is effective to a certain degree. The domain technology points obtained by this method serve as the basic technical terms for domain advanced technology mining and as part of the objects of advancement evaluation. Secondly, this paper selects four aspects (technology life cycle, domain technology theme evolution, domain technology text terminology, and domain patent layout) to extract the text characteristics of technological advancement, and assumes that advanced technology lies in the germination and growth stages of the technology life cycle and appears in new topics of domain technology evolution, new terminology in domain project texts, and the non-mainstream patent layouts of large companies in the domain. According to these hypotheses and the basic technical terms, candidate technical terms that may be advanced are extracted, which enlarges the scope of technical term acquisition. Then, based on related research, this paper summarizes an evaluation index system for technological advancement and fuses the previously extracted text feature information on the basis of theories such as technological maturity and the diffusion of technological knowledge. The index system will be used for domain advanced technology detection and, together with the mining of advanced technology features, constitutes the initial advanced technology detection model. Finally, this paper chooses self-driving automobiles and the Internet of Things as the objects of backtracking experiments for the detection model. Experiments show that the initial model is effective. Based on the backtracking results, the detection model is improved from the perspective of enhancing the expertise and unit nature of the technology points, and the improved backtracking results are compared with the original results. To some extent, the improved model improves the accuracy of advancement detection for the top-ranked candidate technology points.
Classification number:	TP3
Total pages:	66
Number of references:	60
Release date:	2022-06-09

基于层次条件变分自编码器的政府公文自动生成系统的设计与实现.邓雅妮

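Title:	基于层次条件变分自编码器的政府公文自动生成系统的设计与实现
Name:	邓雅妮
Student ID:	1601210498
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public availability:	public
Level of study:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1:	姚亚芝
Advisor 1 affiliation:	School of Software and Microelectronics
Advisor 2:	俞敬松
Advisor 2 affiliation:	School of Software and Microelectronics
Thesis defense date:	2019-05-29
Title (foreign language):	THE AUTOMATIC GOVERNMENT OFFICIAL DOCUMENT GENERATION BASED ON HIERARCHICAL CONDITIONAL VARIATIONAL AUTOENCODER
Keywords:	长短时记忆网络；条件变分自编码器；关键词抽取；政府公文写作；文本生成
Keywords (foreign language):	LSTM; CVAE; Keyword extraction; Government document; Automatic text generation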
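Abstract:	Text generation has in recent years been a highly challenging task in natural language processing (NLP). Good progress has been made on short text and poetry generation, but as text gets longer, problems such as information loss, error propagation, and error drift arise, so research on long text generation, Chinese long text generation in particular, is still at an early stage. Government document generation is a special case of Chinese long text generation. Government documents are a special cultural heritage through which China conveys political tasks, expresses political views, and records historical events, and they have a distinctive line of reasoning and style of wording. The difficulties of generating them have much in common with long text generation in general: both aim for text with wording diversity and thematic consistency. Wording diversity means that a sentence uses varied words to express its meaning instead of repeating the same characters or words; thematic consistency means that sentences, and the words within a sentence, all address the same theme. With the spread of deep learning, seq2seq has become a commonly used framework for high-quality text generation. Introducing a VAE makes the seq2seq generation process more diverse, and researchers have found that adding generation conditions to the VAE to form a CVAE further improves thematic consistency within a sentence and the diversity of its wording. Recent work has also shown that keywords can serve as intermediate generation results to further improve thematic consistency between sentences. Although CVAEs have been shown to work for text generation, their generation is not sufficiently directed, and they cannot adequately guarantee thematic consistency or produce more diverse wording. This thesis adds keywords that act like a writing outline to obtain Key-CVAE, so that when generating Chinese government documents the model considers thematic consistency between words and also improves thematic consistency between sentences. Experiments show that Key-CVAE achieves better-than-expected thematic consistency at both the document and sentence level on the government document dataset built for this thesis. A series of comparative experiments further verify that combining keywords with a CVAE strengthens the CVAE's thematic consistency while preserving wording diversity, and that the diversity of the training dataset affects the generated results. Although long text generation for Chinese is still at an early exploratory stage, the Key-CVAE model introduced here has good reference value and offers a new direction for future research on long text generation.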
Abstract (foreign language):	Text generation is a challenging task in natural language processing (NLP). Although text generation has achieved success in many fields, such as short-text generation and poetry generation, longer texts cause problems such as information loss, error transmission, and error migration. Therefore, research on long-text generation is still in its preliminary stage, especially for Chinese long-text generation, of which government document generation is a special kind. Government documents are a unique cultural heritage with a special use and combination of words, aiming to publish political tasks, express political views, and note historical events. However, the challenges of generating government documents have much in common with those of traditional texts, such as wording diversity and thematic consistency. Wording diversity highlights the variety of words used, and thematic consistency emphasizes the consistency of theme between sentences and between words. With the popularity of deep learning methods, seq2seq is a commonly used, high-quality text generation framework. The use of a VAE can make the generation process of seq2seq more diverse. Scholars have also found that introducing generation conditions into the VAE to form a CVAE can further improve internal theme consistency and wording diversity. In recent work, keywords serving as intermediate generation results have also been shown to further improve the topical consistency between sentences. Although CVAEs have been proven useful for text generation, their generation is not sufficiently directed. This paper proposes a keyword-enhanced conditional variational autoencoder (Key-CVAE) for Chinese government document generation, which adds keywords as a writing outline to improve the consistency of theme between sentences and between words. Experiments show that Key-CVAE achieves a higher-than-expected effect on theme consistency on the government document dataset constructed in this paper. A series of comparative experiments also show that the combination of keywords and CVAE enhances the theme consistency of the CVAE model while maintaining its wording diversity, and verify that the diversity of the training dataset has an impact on the model's generation. Although long-text generation for Chinese tasks is still in the preliminary stage, the Key-CVAE model introduced in this paper has reference value and provides a new idea for research on long-text generation tasks.
Classification number:	TP3
Total pages:	62
Total references:	50
Release date:	2019-06-17

一种英语写作知识点推荐策略.Tianfang Gao

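Title:	一种英语写作知识点推荐策略
Name:	Tianfang Gao
Student ID:	1601210521
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public availability:	public
Level of study:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1:	李博婷
Advisor 1 affiliation:	School of Software and Microelectronics
Advisor 2:	俞敬松
Advisor 2 affiliation:	School of Software and Microelectronics
Thesis defense date:	2019-05-29
Title (foreign language):	A Strategy for Recommending English Writing Knowledge Points
Keywords:	英语写作；知识点推荐；策略制定
Keywords (foreign language):	English Writing; Knowledge Points Recommendation; Strategy Making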
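Abstract:	Whether in exams or in daily life, English writing is a difficulty that Chinese students cannot avoid. Many teaching resources for English writing exist, including books and websites, but their usefulness is relatively limited. The books merely collect and consolidate theory and corpus material, while writing assistance systems score students' essays and correct errors without taking the student's writing level and purpose into account to help them improve in the post-writing stage. Among the many theories of teaching English writing, having students improve by repeatedly rewriting their essays is a widely accepted approach, and this system is designed around it. To find a rewriting direction suited to each student, the system takes full account of each student's writing record and designs and implements a recommendation strategy for English writing knowledge points. While running, the strategy can revise its own feedback to adapt to changes in its object. For the teaching of English writing, the aim of the strategy is to recommend what a student most needs to learn in order to write better; this thesis defines that content as the knowledge points that the student uses rarely and native speakers use often. For the problems Chinese students meet in English writing, such as having no words to use, difficulty combining words into sentences, and unidiomatic expression, the thesis proposes a solution and validates it. To use the system, a student submits an essay. The system computes how often English knowledge points are used along four dimensions (words, collocations, phrases, and sentence patterns) in that essay and in the student's writing history, and compares the usage frequency of each knowledge point in the student's writing with that in model essays to obtain the recommendations. The word recommendation module recommends the set of words that the student uses less frequently than model essays on the same topic do. In the collocation, chunk, and sentence pattern modules, for knowledge points the student has never used, the system recommends those with the highest usage frequency in model essays on the same topic; for knowledge points the student has used, it recommends those with the highest ratio of usage frequency in same-topic model essays to the student's usage frequency. When a student has too few past essays, level determination and topic constraint modules select substitute essays, using the Chinese Learner English Corpus (CLEC) as the substitute corpus. Tests show that the recommended knowledge points effectively improve writers' performance, and the participants unanimously recognized their usefulness. In the evaluation, five people with relatively high English proficiency were first invited to write English essays and then to rewrite them using the system. The five essays were scored by experts and by an automatic scoring system before and after rewriting, and the writers rated the recommended knowledge points. After revision with the recommended knowledge points, the human score rose by 6.3% on average (0.568 points on a 9-point scale) and the machine score rose by 0.5% on average (0.5 points on a 100-point scale). On a 5-point scale, the average human rating was 3.65 for word recommendations, 2.3 for collocation recommendations, 3.63 for chunk recommendations, and 2.8 for sentence pattern recommendations. To verify the system's effect on writers of average English proficiency, the author randomly drew five essays from the CLEC corpus and invited five students to rewrite them strictly according to the recommended knowledge points. After revision, the human score rose by 7.4% on average (0.67 points on a 9-point scale) and the machine score rose by 8.5% on average (8.5 points on a 100-point scale).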
Foreign-language abstract:	︿ English writing is a major obstacle for Chinese students, both in exams and in daily life. Although various books and websites on English writing exist, most of the books merely present corpus resources, while the auxiliary websites do nothing more than examine and grade students' articles. These tools lack the individualized writing guidance that is key to advancing students' writing ability. Among the diverse teaching theories of English writing, one of the most widely recognized is improvement through rewriting, on which this system is designed and developed. To find a suitable rewriting direction for each student, the system provides users with knowledge points of words, collocations, chunks and sentence patterns. The system adjusts its feedback according to changes in the user. The main goal of an intelligent auxiliary writing platform is to boost students' writing ability by recommending suitable knowledge points. The knowledge points recommended in this project are those that Chinese students rarely use while native writers use often. The system aims to tackle common problems of Chinese students, such as being at a loss for words, being unable to connect words into strong expressions, and writing unidiomatically. This thesis proposes solutions to these problems and verifies the system's effect. To use the system, a student inputs an article; the system then calculates the usage frequency of different words, collocations, chunks and sentence patterns in the user's articles and compares it with that in native writers' articles to decide which ones to recommend. When recommending words, the system picks those that students use less than native writers. When collocations, chunks and sentence patterns are selected, the strategy is divided into two scenarios. For knowledge points that a user has never used, those with the highest usage frequency in model essays are selected.
For knowledge points that the user has used before, the system calculates the usage frequency in model articles divided by the usage frequency in the student's articles and recommends the knowledge point with the largest value. When a user does not have enough historical articles, the system applies a level determination module and a genre definition module to the Chinese Learner English Corpus (CLEC) to find substitutes. The system is shown to be effective in improving users' writing ability, and the recommended knowledge points are approved by the users. Evaluation was first done by tracking the human scores (given by two English experts) and machine scores (given by pigaiwang, a website with an embedded grading module) of five articles written by students. Results show that after using the system, human scores increase by 6.3% on average (0.568 points out of a possible 9) and machine scores increase by 0.5% on average (0.5 points out of a possible 100). When asked to rate the recommended knowledge points on a 5-point scale, the five writers gave 3.65 to the word recommendation module, 2.3 to the collocation recommendation module, 3.63 to the chunk recommendation module and 2.8 to the sentence pattern module. Because the five writers invited by the author generally have a high level of English writing, the author also tested the system's effect on average students by randomly extracting five CLEC articles and inviting five students to rewrite them using only the knowledge points recommended by the system. After modification, human scores increase by 7.4% (0.67 points out of a possible 9) and machine scores increase by 8.5% (8.5 points out of a possible 100). ﹀
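The two-scenario recommendation rule described in this abstract can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the thesis's implementation: the function name `recommend`, the dictionary-based inputs, the `top_k` parameter and all frequency values are hypothetical and chosen only for the example.

```python
def recommend(student_freq, model_freq, top_k=3):
    """Hypothetical sketch of the two-scenario rule from the abstract.

    student_freq / model_freq map a knowledge point (e.g. a collocation)
    to its usage frequency in the student's essays / in model essays.
    """
    # Scenario 1: points the student has never used,
    # ranked by their frequency in model essays.
    unused = [k for k in model_freq if student_freq.get(k, 0) == 0]
    unused.sort(key=lambda k: model_freq[k], reverse=True)

    # Scenario 2: points the student has used,
    # ranked by the ratio of model-essay frequency to student frequency.
    used = [k for k in model_freq if student_freq.get(k, 0) > 0]
    used.sort(key=lambda k: model_freq[k] / student_freq[k], reverse=True)

    return unused[:top_k], used[:top_k]


# Made-up frequencies, purely for illustration.
model = {"play a role": 0.08, "in terms of": 0.06, "as a result": 0.04, "a lot of": 0.02}
student = {"as a result": 0.01, "a lot of": 0.05}

new_points, underused_points = recommend(student, model, top_k=2)
print(new_points)        # ['play a role', 'in terms of']
print(underused_points)  # ['as a result', 'a lot of']
```

In this toy input, "as a result" ranks first among the used points because model essays use it four times as often as the student does, while "a lot of" is already overused by the student and ranks last.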
Classification number:	H0-0
Total pages:	64
Number of references:	58
Release date:	2019-06-04

富信息古籍整理平台的设计与研究.刘晓娟

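Link

Title:	富信息古籍整理平台的设计与研究
Name:	刘晓娟
Student ID:	1601210635
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Access:	Public
Degree level:	Master's
Degree:	Professional Master of Engineering
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	王雷
Advisor 1 affiliation:	School of Foreign Languages
Advisor 2 name:	高志军
Advisor 2 affiliation:	School of Software and Microelectronics
Defense date:	2019-05-29
Foreign-language title:	Research and Design of an Information-Rich Ancient Books Collation System
Keywords:	ancient books collation; ancient books digitalization; emendation; information-rich
Foreign-language keywords:	Ancient books collation; Ancient books digitalization; Emendation; Information-rich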
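Abstract:	︿Ancient books is the collective term for historical texts and similar resources that were hand-copied or block-printed before the 1911 Revolution; they have high value as cultural relics and great cultural significance. Because of their age, however, good editions rarely survive. To restore the original appearance of ancient books and make them easier for later generations to read and study, collation workers must recover lost texts, collate variants, annotate and punctuate. Digitization is the necessary path to sharing ancient book resources. How to use advanced information technology to make collation more efficient and to solve the problems that arise during digitization is an urgent question. An analysis of the current state of ancient book collation reduces the problems to four points: first, there is no open platform with complete functions and a well-developed workflow; second, there is no unified, standardized collation workflow, so collation work lacks guidance; third, only the result of collation is valued, and information about the process is lost; fourth, highly specialized collation work lacks expert participation, so quality is uneven. To solve these problems, with the goal of providing a convenient, complete and efficient collation system and in light of the characteristics and principles of ancient book collation, the author proposes the novel idea of valuing the collation process and completes the product design of a multi-layer, traceable collation platform that gives collators an efficient working environment. The platform's advantages are as follows. First, it covers the complete workflow, including key tasks such as edition selection, text entry and content collation. Guided by the system, collators can finish their tasks on a single platform and switch between tools less often. Second, it values the collation process: the system is divided into working layers and, through its information-rich design, saves the emendation information of each layer, so the process can be traced from the stored data, problems can be located and corrected promptly, and research is supported. Third, it distinguishes the roles of experts and ordinary collators and assigns them different tasks, so that participants are competent for their work and produce results that meet requirements; in addition, because the system has multiple layers, results can be reviewed at each layer to ensure quality. Under the guidance of Ru Cang (儒藏) collation experts at Peking University, the author validated the platform through application examples of typical collation scenarios. The results were affirmed by the experts, showing that the information-rich collation platform can facilitate ancient book collation and improve the credibility of its results.﹀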
Foreign-language abstract:	︿ Ancient books are historical books written or published before the 1911 Revolution, and they are carriers of Chinese culture. Hundreds of years have passed since these books first appeared, so they have inevitably suffered loss. To restore the original appearance of these books for reading and research, specialists need to collate them and add punctuation, notes and so on. For the purpose of widely sharing ancient book resources, digitalization is the only way. In the past, scholars did their research with outdated tools, but information technology now brings more possibilities to collation work. Although computer-aided collation systems help improve efficiency, many problems remain in collation practice. First, there is no open and integrated platform for collation work. Second, standard workflows and guidance are lacking. Third, valuable information is lost because the work result is emphasized over the work process, so the workflow cannot be traced back. Last but not least, there are quality issues in collation work. To solve these problems, the author proposes the novel idea of valuing the collation process. Guided by this idea, the paper designs an ancient books collation system with the following advantages. First, the system guides users through the whole collation process. Second, the design emphasizes the collation process and uses XML files to record process data, which makes it possible to trace the workflow back. Third, differentiating the work of different users and verifying results at multiple layers offers a guarantee of quality. This paper designs a specific collation scenario to validate the Information-Rich Ancient Books Collation System.
Experts from the Ru Cang Compiling and Editing Center of Peking University gave their recognition. The interviews showed that this study and design can effectively lighten the burden of collation work and improve working efficiency in this field. ﹀
Classification number:	TP3
Total pages:	118
Number of references:	76
Release date:	2019-06-15

公文辅助阅读平台的设计与实现.何寒松

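Link

Title:	公文辅助阅读平台的设计与实现
Name:	何寒松
Student ID:	1601210540
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Access:	Public
Degree level:	Master's
Degree:	Professional Master of Engineering
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	王雷
Advisor 1 affiliation:	School of Foreign Languages
Defense date:	2019-05-29
Keywords:	official documents; natural language processing; reading assistance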
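Abstract:	︿With the development of the Internet, official documents are no longer used only to convey information within government departments; they have begun to enter people's daily lives, and more and more people read and study official documents through the “学习强国” app, the politics sections of news apps, and the websites of local governments. Efficient reading of official documents not only lets civil servants pass information up and down more quickly and effectively, but also lets the public quickly learn and grasp what the documents convey. The language of official documents, however, is not the spoken language of everyday life but the written language of government office work. Its accuracy, conciseness, plainness and standardization make the wording and sentence construction of official documents very rigorous, which places demands on readers' education level and reading comprehension. To make official documents easier for most people to understand, an effective reading platform is needed to assist readers. The author's survey found two problems in official document reading that have not been effectively solved. First, readers cannot easily extract key information from the long sentences of official documents. The sentences most commonly used in daily life are about 2 to 15 characters long, whereas the average sentence in official documents is 40 to 50 characters long, mainly because such sentences contain many chunks, for example “社会主义核心价值观”, “两学一做” and “第十九次全国代表大会”, and because official documents contain complex sentences with too many clauses. Second, the current presentation of official documents does not highlight their key information. Both electronic and printed publications present the raw text, which neither reflects the characteristics of official document language nor helps readers read and understand quickly. To solve these problems, the author first crawled more than forty million characters of official documents from the 规划智库 website of Peking University as the object of the text processing work, then analyzed the textual features of official documents with natural language processing techniques. By extracting the key information of sentences and optimizing the presentation of documents, an official document reading assistance platform was designed and implemented, which assists readers in two ways. On the one hand, to extract the key information of a sentence, the chunks and the main constituents (the sentence trunk) must be identified. Because sentences in official documents are so long, if words are taken as the units of a sentence, readers must take in each word one by one, assemble the words into chunks and then into the sentence, which increases the load on short-term working memory. This thesis therefore takes chunks as the units of a sentence and, starting from multi-character words, organization names, four-character phrases, document names and new concepts, extracts these kinds of chunks from sentences with rule-based and statistical methods. The accuracy is 92.91% for multi-character word recognition, 82.88% for organization name recognition, and 87% for recognition of meeting names among organization names. Complex sentences are common in official documents and often contain several clauses, so the thesis first splits complex sentences into clauses. It also builds a sentence trunk recognition model with a maximum entropy model; on the sample documents, the accuracy for each of the three main constituents (subject, predicate and object) is above 90%. On the other hand, the thesis uses the results of this analysis to study the optimized presentation of long sentences, helping readers understand complex long sentences more quickly and effectively. The platform is built with Flask. Recognized chunks are marked in the display; different spacing is applied between paragraphs, between sentences and between clauses to improve the presentation; and chapter headings are extracted to build a sidebar navigation, so that readers can quickly grasp the thematic outline of a whole document and retrieve information within it more efficiently. Finally, the author invited 10 students to complete official document reading comprehension tasks designed for this thesis and compared their test scores before and after using the platform; the results show that the platform does help with reading official documents to some extent. The platform has practical and research value. For readers, it helps improve the efficiency of reading official documents; for those who will work on optimizing official document reading, automatic marking of chunks and main constituents reduces manual annotation work. The technical exploration of the official document register in this thesis can serve as a reference for researchers who apply natural language processing to official documents, and for research on reading assistance for texts in other fields.﹀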
Classification number:	TP3
Total pages:	84
Number of references:	40
Release date:	2019-06-04

多功能古籍协同研究平台的研究与设计.邓娟

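Link

Title:	多功能古籍协同研究平台的研究与设计
Name:	邓娟
Student ID:	1601210497
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Access:	Public
Degree level:	Master's
Degree:	Professional Master of Engineering
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	王雷
Advisor 1 affiliation:	School of Foreign Languages
Defense date:	2019-05-29
Keywords:	ancient books research; ancient books collation; collaborative research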
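Abstract:	︿Ancient texts are indispensable research materials for experts on ancient books and an important vehicle of cultural transmission. Traditional research on ancient books relies on paper resources and is heavily constrained by time and space, which leads to limited resources and low research efficiency. Existing digital platforms for ancient books, meanwhile, fall short in functions and in the way content is presented, and can hardly play a practical role. Improving research efficiency and spreading traditional culture requires an effective platform for research and reading. The author's survey found four problems that have not yet been effectively solved. First, readers of ancient classics face barriers in understanding characters and words, sentences and passages, and general knowledge of ancient times; existing platforms provide the texts without addressing this, which hinders the spread of traditional culture. Second, existing digital platforms do not fully mine the content of the texts and lack relations between documents, such as citation relations and relations between different editions of a document. Furthermore, proper nouns in ancient texts, such as personal names and place names, are not linked with related material into a knowledge network; for example, a person is not linked to aliases, official posts, works or biographical material. Third, communication and cooperation among researchers are not well supported, so research ideas and results are hard to share. Fourth, research support tools, such as discovery and comparison of similar sentences and passages, are lacking. To solve these problems, this study draws on the theory and methods of ancient book collation and research, takes the characteristics of ancient texts into account, and uses the advantages of digital platforms to design a multifunctional collaborative research platform for ancient books, which improves the platform's assistance in four ways. First, it provides character dictionaries, word dictionaries and reference materials, builds knowledge bases of biographical and geographical material to assist reading, and provides annotations, translations and other content to remove reading barriers. Second, in line with the characteristics of ancient texts, it provides data processing tools such as proper-name tagging and cited-book mining, which, with computer assistance, establish links between pieces of knowledge in the texts and form a knowledge network. Third, it organizes resources as “research projects” to link research topics, research documents and researchers, and then supports online communication and collaborative research through online annotation and editing. Fourth, it integrates auxiliary tools, such as similar sentences and passages, text comparison, citation of classics and citation mining, to support research work and improve efficiency. Finally, the study invited experts on ancient books to try the platform and used real research examples to verify the effectiveness and soundness of the design. The results show that the collaborative research environment and the tools such as document tagging provided by this thesis effectively assist researchers in their work and improve research efficiency.﹀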
Foreign-language abstract:	︿ Ancient books are essential research materials for experts and important carriers of cultural transmission. Traditional research resources are mainly paper materials, which are limited by time and space, so experts are confronted with limited research materials and low research efficiency. To solve these problems, digital platforms have appeared and play significant roles in ancient books research, but these systems have many shortcomings in functions and content presentation. In this context, this paper aims to design a platform that provides experts in ancient books research with a collaborative online research environment and ordinary readers with a user-friendly reading environment. By surveying existing platforms and experts, the author finds that the following four problems have not been effectively resolved. First, language barriers and a lack of general knowledge about ancient times are major obstacles for modern Chinese readers of ancient texts. Current platforms do not address these obstacles when providing resources and functions, so they cannot popularize traditional culture effectively. Second, the existing platforms fail to fully exploit the content of the literature or to link cited documents or different versions of the literature. Furthermore, these platforms cannot form a knowledge network that connects proper names in the ancient books, such as the names of people and places, with related content. Third, for lack of a collaborative online research environment, researchers are unable to share research ideas and results with other people. Fourth, there are no research support tools such as similar-segment discovery and citation marking.
To solve the above problems, this study builds on the theory and methods of ancient books research, considers the characteristics of ancient works, and takes advantage of digitalization to design a multifunctional platform for collaborative research on ancient books, which plays an auxiliary role in the following four aspects. First, the platform establishes a knowledge base of biographical and geographic materials to help reading and provides annotations, translations and other content to break down reading barriers. Second, considering the characteristics of ancient books, it provides data processing tools such as name tagging and quotation mining, and links related content to establish a knowledge network. Third, it organizes research topics, literature and researchers by “research project”, and then establishes a communication and collaboration platform with online annotation and editing functions. Fourth, the online system integrates similar-segment discovery, text comparison, automatic annotation and other tools to assist research work and improve research efficiency. Finally, this paper verifies the validity and feasibility of the design through user testing and a case study. According to the feedback of experts in ancient books research, this design complements the limited functions of current platforms, helps experts carry out research work and promotes ancient books to the public. ﹀
Classification number:	TP3
Total pages:	83
Number of references:	86
公开日期：	2019-06-17

2019-05-27

大学英语写作学习平台游戏化设计研究与实践.戴欣怡

题名：	大学英语写作学习平台游戏化设计研究与实践
姓名：	戴欣怡
学号：	1601210495
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	信息科学技术学院
论文答辩日期：	2019-05-27
关键词：	大学英语写作；游戏化学习；产品设计
论文摘要：	︿随着国际化进程的不断发展，人们越来越重视外语能力的培养，尤其是在真实情境中的语言运用能力，其中英语写作能力占据了非常重要的位置。但由于国内学生的外语基础较为薄弱，能够实际操练的机会少，且外语写作水平的提升并非一蹴而就，因此写作成为了国内外语学生的弱项，同时也成为了他们焦虑和畏惧的对象。随着在线学习的日益普及和游戏化概念的推广，游戏化学习的方法越来越多地应用在各类在线英语学习系统中。但现有的在线写作学习系统仅停留在写作练习本身，没有考虑到学生在写作学习过程中的心理状态与变化，既无法对他们起到激励作用，也无法缓解他们的焦虑感。本文基于游戏化写作学习理论、动机激励理论和焦虑理论，分析了现有相关产品在游戏化设计上的优势与不足，针对目标学生群体的心理需求和现有在线学习游戏化设计存在的问题完成大学英语学习平台的游戏化设计。通过游戏、排名、奖励等机制提升学生在学习过程中的获得成就感与满足感，通过明确阶段性学习目标、营造轻松的学习氛围降低学生的焦虑感，通过多维度的学习行为记录与评价机制给予学生更为客观的评价。由于系统中涉及的游戏化元素和机制较多且错综复杂，本文从工作成果中选择了比较有代表性的课程地图、积分系统和排行榜三个游戏化模块进行了详细的研究和探讨，针对各个模块提出了不同的设计方案。此外，还简单介绍了本系统中勋章、任务系统以及基于游戏化的学生评价方法的设计思路与方案。基于以上工作，本研究选取了40 名非英语专业大学二年级学生进行对照教学实验，通过实验观察以及实验后的问卷调查、数据分析和访谈分别论证了课程地图、积分系统和排行榜模块不同的设计方案，包括帮助学生明确学习目标，给予他们正向激励和成就感，帮助外部动机内化提升学习积极性，以及树立学生的信心、降低写作学习过程中的焦虑感等方面的设计有效性，并针对若干疑难筛选出了各个模块的最佳设计方案。本研究中大学英语写作平台游戏化设计填补了在线英语写作游戏化学习方面的空白，有效地提升了学生在写作学习过程中的胜任感、自主感和归属感，缓解了其面对写作学习时恐惧和焦虑的情绪，同时提出了一种基于游戏化的学生评价机制，补足了现有英语游戏化学习设计中缺乏形成性评估机制的短板，对英语写作移动教学和游戏化学习具有一定的参考价值。﹀
外文摘要：	︿ With the continuing process of internationalization, people pay more and more attention to foreign language competence, especially the ability to use the language in real life, in which writing ability plays a crucial role. However, since writing skills cannot be improved overnight, most Chinese students, who have limited knowledge of English and lack the opportunity to put it into practice, are frustrated when they write in English. With the growing popularity of online learning and the spread of gamification, gamified learning methods are increasingly used in various online English learning systems. Unfortunately, existing online writing learning systems focus only on the writing practice itself and fail to take into account students' ever-changing psychological state in the process of learning to write. Consequently, these systems can neither motivate learners nor relieve their anxiety. Based on gamified writing learning theory, motivation theory and anxiety theory, this paper analyzes the advantages and disadvantages of existing related products in terms of their gamification design, and aims to design an English learning platform that caters to the psychological needs of university students. Learners' sense of achievement and satisfaction are enhanced by games, ranking, rewards and other mechanisms in the learning process. Meanwhile, by clarifying the learning objectives for each stage and creating a relaxed learning atmosphere, the platform reduces their anxiety. Moreover, students receive a more objective evaluation based on multidimensional records of learning behavior. Because of the complexity of the numerous gamification elements and mechanisms involved in the system, this paper selects three representative components, namely the learning map, the score system and the ranking list, for detailed research and discussion. Different designs are proposed for each of them. In addition, the design ideas and schemes for the medals, the task system and the gamification-based evaluation method are briefly introduced. Based on the above work, this study selected 40 second-year non-English majors for a comparative teaching experiment. The different designs for the learning map, score system and ranking list were evaluated through experimental observation, a post-experiment questionnaire survey, data analysis and interviews. The evaluation covered how effectively the designs help students set clear learning goals, provide positive motivation and a sense of accomplishment, guide students to internalize external motivation and thereby increase their enthusiasm, and build their confidence while reducing their anxiety in learning to write. The best design for each component was identified. The gamification design of the college English writing platform in this study fills a gap in gamified online English writing learning and effectively enhances students' sense of competence, autonomy and belonging in the process of learning to write. It also alleviates their anxiety and fear during learning. The proposed gamification-based evaluation mechanism compensates for the lack of formative assessment in existing gamified English learning designs. The study has some reference value for online English writing learning and gamified learning. ﹀
分类号：	TP3
论文总页数：	60
参考文献总数：	44
公开日期：	2019-06-14

中文文本分析量化指标体系的研究与应用.杨雨萌

题名：	中文文本分析量化指标体系的研究与应用
作者：	杨雨萌
学号：	1601210811
语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师姓名：	俞敬松
导师单位：	软件与微电子学院
答辩日期：	2019-05-27
题目(外文)：	Research and Application of Quantitative Indices System for Chinese Textual Analysis
关键字(中文)：	中文文本分析；语言特征；量化指标；自然语言处理
关键字(外文)：	Chinese textual analysis; Linguistic feature; Quantitative indices; Natural language processing
文摘：	︿文本特征的自动量化分析是通过计算机程序实现文本特征的定量评估。文本量化的一大核心是建立一组反映文本特征的指标体系。相较于人工分析，定量分析文本特征更加客观和高效。因此，在西方它已被应用在字母类语言的话语分析、语料库研究等领域。目前，文本特征量化指标的研究多以英文文本为对象，中文文本分析量化指标研究较少。从现有研究来看，中文文本分析主要存在三个问题。第一，现有的中文文本量化指标体系化不足，研究较为单一且不够全面；第二，缺少一个中文文本自动量化分析系统；第三，中文文本量化分析指标体系和量化分析系统的应用价值亟待研究和证明。基于上述中文文本分析量化指标研究中存在的问题和不足，本研究围绕现代汉语，确立了文本分析量化指标研究、自动量化分析工具的实现、量化指标的应用三大研究主线。第一，在英文文本分析量化指标的研究基础上，以汉语特点和汉语语法为纲，建立了适用于中文文本分析的量化指标体系；第二，以中文文本量化指标体系为基础，依托于自然语言处理技术与中文语言资源，设计并实现了一款中文文本量化分析工具；第三，将中文文本量化分析工具应用于现代汉语文本特征研究和文本分级模型。本研究建立的中文文本分析量化指标体系聚焦于文本的语言特征，总共包括五个层面：描述性特征层面、汉字层面、词汇层面、句子层面和语篇层面。描述性特征层面包括12个指标；汉字层面包括32个指标；词汇层面包括67个指标；句子层面包括60个指标；语篇层面包括1个指标。整个文本量化指标体系包含的指标共计170余项。本研究以两个应用为例，阐述了中文文本分析量化指标体系和量化分析系统的实用价值。第一，本研究以人教版小学语文教科书课文为例，从汉字、词汇和句子等五个层面，对语料进行了较为全面地统计分析。第二，本研究基于机器学习算法和量化指标体系，构建了文本分级模型，模型的预测准确度高达0.90左右。研究数据结果显示，随着年级的上升，小学教科书课文的字词量、汉字复杂度、词汇难度和句法复杂度等特征值均呈现上升态势，基本遵循了从简到难的编排特点。然而，课文的用字用词仍存在改进之处。例如低年级课文中出现了较多的非常用字词；部首表收录内容与课文用字的关联度较弱等。这些研究结果已被应用于北京大学俞敬松老师研究小组的相关教学研究中。此外，本研究构建的文本分级模型能参照标准教科书，预测文本的阅读级别，从而被应用于不同阅读级别文本的自动分类和筛选。﹀
文摘（外文）：	︿ Automated textual analysis analyzes text features quantitatively with computer programs. How to build a group of indicators that reflect text characteristics is one of the core issues of textual analysis. Compared with manual analysis, quantitative analysis of text is more objective and efficient. Therefore, it has been applied in discourse analysis and corpus research on alphabetic languages in the Western world. Currently, most studies in automated textual analysis focus on English texts, and studies on Chinese textual analysis are rare. Three major shortcomings can be found in the existing studies on Chinese textual analysis. Firstly, the existing studies focusing on textual features do not take a systematic and comprehensive approach. Secondly, there is no tool available for the automated quantitative analysis of Chinese texts. Thirdly, the practical value of the quantitative indices system and the automated tool is still to be researched and validated. To solve these problems and fill the research gap, this research focuses on three issues related to the analysis of modern Chinese texts. Firstly, a quantitative indices system is established for Chinese textual analysis, building on research on quantitative indices for English texts and guided by the characteristics and grammar of Chinese. Secondly, a tool for the automated analysis of Chinese texts based on the indices system is designed and built with the support of natural language processing technology and Chinese language resources. Thirdly, the tool is used in the analysis of modern Chinese texts and in the construction of text leveling models. The quantitative indices system for Chinese textual analysis focuses on linguistic features. The system consists of five levels: descriptive indices, Chinese characters, words, sentences, and discourse, with 12, 32, 67, 60, and 1 indicators, respectively. In total, the indices system has more than 170 indicators. Two applications are used as examples to demonstrate the practical value of the quantitative indices system and the automated tool. In the first application, the linguistic features of primary school Chinese textbooks published by the People’s Education Press are extracted, and the texts are analyzed thoroughly on the five levels, including Chinese characters, words, and sentences. In the second application, the quantitative indices system is combined with machine learning algorithms to build text leveling models for Chinese texts. The prediction accuracy of the text leveling models is about 0.90. According to the results, many feature values of the textbook texts, such as character and word counts, Chinese character complexity, vocabulary difficulty, and syntactic complexity, show an upward trend as the grade level rises, indicating that in general the texts are simpler for students in lower grades and more complex for students in higher grades. However, there is still room for improvement regarding the use of characters and words. For example, there are many uncommon characters and words in the textbooks for students in lower grades. Besides, the content of the radical table has little connection with the Chinese characters used in the textbooks. These findings have been used to support the related teaching research of Jingsong Yu's research team at Peking University. Furthermore, the text leveling models can predict the reading level of Chinese texts with reference to the levels of standard textbooks, and can therefore be used for the automated classification and selection of Chinese reading texts. ﹀
分类号：	TP3
论文总页数：	147
参考文献数：	86
公开日期：	2022-06-10

医学英语词典的研究与设计.尹梦佳

题名：	医学英语词典的研究与设计
姓名：	尹梦佳
学号：	1601210822
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2019-05-27
关键词：	医学英语词汇；词典；个性化设计
论文摘要：	︿人类社会进入21世纪以来，科技的飞速发展带动了各个领域的不断进步，与此同时，人们在各个专业化领域的需求也在不断探索和前进。因此，传统的专业领域工具已经无法满足人们的需要。对于医学领域人士（医生、医学生、医学爱好者等）来说，一款高效的医学英语电子词典是他们工作、学习必不可少的好帮手。然而，纵观过往的医学英语电子词典产品，大多存在以下三个问题：第一，依附于通用型词典之上，对于医学领域的专业广度和深度拓展不够。词汇和相关内容的搜索展示仍停留在通用词水平，无法为专业人士提供符合他们专业程度的词汇查阅需求。第二，词典功能单一。当前市面上流行的许多医学电子词典都将功能局限在“查词”上，无法为专业医学人士日常所需的文献阅读和论文撰写提供较为便利的解决方案。第三，无法辨别医学用户的专业偏好。目前，多数常用电子词典对于医学用户的专业偏好并没有区分，从而无法从专业的科室角度为用户提供高效的查词体验,也无法为用户提供个性化的搜索和展示。为了解决上面三个问题，本研究首先进行了医学英语词汇的构成分析，研究英语词汇的特点，提升词典设计的合理性。然后从医学英语词汇联结的角度入手，研究了UMLS和SNOMED CT两大医学界较为权威的英语词汇系统，从中提取医学英语词汇之间的相互联系。并且，为了解决现存医学数据库中词汇科室不明的问题，本研究采取机器算法与人工校对相结合的方式，利用不同科室的核心词汇表，加上提取的词汇联系网络，将获得的医学词汇数据进行科室分类。最后，本研究结合个性化产品设计的思路，辅之以针对医学用户日常所需的便利功能，如写作助手和阅读助手，为医学领域的专业用户设计了一款个性化的医学英语电子词典。为了验证本研究设计的词典的实用性，本研究邀请了来自浙江大学医学院的专家与同学参与了有效性验证。实验表明，本研究设计的电子词典可以有效提高用户查词的效率。此外，词典系统中的辅助功能，如写作、阅读助手等，也帮助用户提高了学习与工作的效率。本研究设计的医学英语电子词典有效地解决了当前医学电子词典中存在的专业化和个性化问题，帮助用户提高了学习和工作效率，对医学英语电子词典的研究与发展有一定参考价值。﹀
外文摘要：	︿ Great changes have taken place in various fields with the rapid development of science and technology since the beginning of the 21st century. At the same time, people's needs in specialized fields keep growing. Therefore, traditional professional tools can no longer meet people's needs. For people in the medical field (doctors, medical students, medical enthusiasts, etc.), an efficient professional medical English dictionary is an essential aid for work and study. However, most existing medical English dictionaries have the following three problems. Firstly, they are built on top of general-purpose dictionaries and lack professional breadth and depth, so users, especially medical professionals, find it difficult to look up specialized knowledge such as diseases and treatments. Secondly, most dictionaries nowadays offer only the single function of word retrieval. Such a dictionary can hardly satisfy medical professionals’ daily needs, since they have to read and write professional literature in their work and study. Thirdly, all users are treated identically, which means that these dictionaries cannot distinguish the professional preferences of medical users, nor can they provide users with an efficient, personalized word-retrieval and word-display experience. To solve the three problems above, this study first analyzes the characteristics of medical English vocabulary to improve the professionalism of the dictionary design. Then, from the perspective of associations between medical English words, it studies two authoritative English vocabulary systems, UMLS and SNOMED CT, from which the relationships between medical English words are extracted. Moreover, to solve the problem that the department to which a word belongs is unclear in existing medical databases, this study combines machine algorithms with manual proofreading, using the core vocabulary of 18 departments together with the extracted vocabulary association network, to classify the obtained medical vocabulary data by department. Finally, drawing on the idea of personalized product design, this study designs a personalized medical English dictionary for professional users in the medical field, supplemented by convenient functions for their daily needs, such as a writing assistant and a reading assistant. To verify the practicability of the dictionary designed in this study, experts and students from the School of Medicine at Zhejiang University were invited to take part in a validation experiment. The results show that the dictionary system designed in this study can improve the efficiency of word retrieval. In addition, the auxiliary functions of the dictionary system, such as the writing and reading assistants, also help users study and work more efficiently. The medical English dictionary designed in this study addresses the specialization and personalization problems of current medical dictionaries and helps users study and work more efficiently. The study also has some reference value for the further development of medical English dictionary design. ﹀
分类号：	TP3
论文总页数：	79
参考文献总数：	36
公开日期：	2019-06-04

多维度智能英语词汇学习知识库研究.屠少辉

题名：	多维度智能英语词汇学习知识库研究
姓名：	屠少辉
学号：	1601210727
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	姚亚芝
导师1单位：	软件与微电子学院
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2019-05-27
外文题名：	Study on Multi-dimensional Knowledge Base System for English Vocabulary Intelligent Learning
关键词：	英语词汇学习；多维度；知识库；资源加工
外文关键词：	English vocabulary learning; Multi-dimensional; Knowledge base; Resource processing
论文摘要：	︿词汇教学是英语教学的重要组成部分。在词汇教学中，教学材料直接影响到学生的学习效果。经过调研，发现虽然目前可使用的英语词汇学习资料丰富多样，但是内容分散，良莠不齐，无法完全满足教师和学生的资源需求。如果能够发挥计算机和互联网技术的优势，对词汇学习材料进行采集和整合，将对英语词汇教学质量的提升有所助益。为整合英语词汇教学中的资源，本研究广泛地收集词汇学习材料，使用计算机技术对文本进行加工，根据一定逻辑对素材进行组织，建设英语词汇学习知识库。本文研究的关键问题有知识库建设的方法和思路、大规模资源汇总、资源加工存储的规范标准和自动处理程序的设计与实现。本研究从互联网上采集了大量词汇学习资源，进行汇总，建设资源库。之后在文献分析的基础上构建词汇知识模型，并以此作为整合资源的逻辑依据。词汇知识模型从习得过程和知识内容两个角度出发组织词汇知识，响应了学习者在习得词汇过程中词汇知识需求的动态变化。模型以词汇习得过程为框架，词汇知识为内容，将词汇习得过程分为感知、理解、联想和输出四个阶段，让知识内容聚合到音位、形式、语境、语义、搭配、词源、产出和主题八个维度下。在建设知识库时，根据词汇知识模型设计知识库的结构和资源加工的规范标准，对例句和搭配的自动抽取等关键问题进行研究。最终，本研究整合了通识英语教学中的词汇学习资源，解决了知识库建设中的关键问题，开发了自动处理程序，实现了一定规模的词汇学习知识库。对于整合后的词汇学习资源，本文进行了客观指标评估、准确率抽样检查和场景检查，证明了知识库能够满足教师和学生的资源需求，帮助改善词汇教学的效果。本研究的创新之处包括以下三点：1）在文献研究的基础上构建词汇知识模型，根据知识模型从各类词汇学习资料中针对性地提取内容，然后进行整合，保证了内容的丰富性和资源间的关联性，去掉了重复性和低质量的学习材料，是一种语言学习资源建设的新思路；2）知识库将学习资源聚合到不同的维度下，实现了资源的模块化，在应用知识库时，可以根据需求灵活地调用和组织内容；3）本研究在加工资源时，对自动处理程序中的关键问题进行研究，使用自然语言处理等计算机技术提高资源建设的效率。﹀
外文摘要：	︿ Vocabulary teaching holds an important position in English language teaching, and the teaching materials directly influence students' learning outcomes. A survey of teachers and students found that although the English vocabulary learning resources currently available are rich and varied, they are scattered and uneven in quality, and fail to fully meet the resource needs of teachers and students. If vocabulary learning materials were collected and integrated with computer and Internet technologies, the quality of English vocabulary teaching could be improved. In order to integrate resources for English vocabulary teaching, the author collected a large quantity of vocabulary learning materials, applied computer technology to process the texts, organized the materials according to a certain logic, and finally built an English vocabulary learning knowledge base. The key issues in this study are the methods and ideas of knowledge base construction, the gathering of large-scale resources, the standards for resource processing and storage, and the design and implementation of automatic text processing programs. This study collected a large amount of vocabulary learning resources from the Internet, from which a resource library was built. Then, on the basis of literature analysis, the author proposed a vocabulary knowledge model as the logical basis for integrating resources. The vocabulary knowledge model organizes vocabulary knowledge from the perspectives of acquisition process and knowledge content, responding to the dynamic changes in learners' vocabulary knowledge needs during vocabulary acquisition. The model takes the vocabulary acquisition process as the framework and vocabulary knowledge as the content. The vocabulary acquisition process is divided into four stages: perception, understanding, association and output. The knowledge content is aggregated into eight dimensions: phoneme, word form, context, semantics, collocation, topic, etymology and production. When constructing the knowledge base, the knowledge base structure and the standards for resource processing were designed according to the vocabulary knowledge model, and key issues such as the automatic extraction of example sentences and collocations were studied. In the end, this study integrated the vocabulary learning resources in general English teaching, solved the key problems in knowledge base construction, developed automatic processing programs, and built a vocabulary learning knowledge base of a certain scale. For the integrated vocabulary learning resources, this paper conducted an objective index evaluation, sampled accuracy checks and scenario checks, which showed that the knowledge base can meet the resource needs of teachers and students and help improve the effect of vocabulary teaching. The innovations of the study include the following three points: 1) a vocabulary knowledge model is built on the basis of literature research, and content is extracted from various vocabulary learning materials according to the model and then integrated, which ensures the richness and interconnection of the resources in the knowledge base and removes repetitive and low-quality learning materials; this is a new approach to building language learning resources; 2) the knowledge base aggregates learning resources into different dimensions and modularizes them, so that users or applications can flexibly invoke and organize the resources according to their requirements; 3) this paper studied the key issues in the automatic processing programs used for resource integration, and used computer technologies such as natural language processing to improve the efficiency of resource construction. ﹀
分类号：	TP3
论文总页数：	53
参考文献总数：	70
公开日期：	2019-06-04

法律英语词汇学习系统研究与设计.包珍

题名：	法律英语词汇学习系统研究与设计
姓名：	包珍
学号：	1601210427
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	姚亚芝
导师1单位：	软件与微电子学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2019-05-27
外文题名：	Research and Design of a Legal English Vocabulary Learning System
关键词：	法律英语词汇；学习资源；词汇学习；词汇复习；词汇推荐策略
外文关键词：	Legal English vocabulary; Learning resources; Vocabulary learning; Vocabulary review; Vocabulary recommendation strategy
论文摘要：	︿随着我国涉外法务日益频繁，法律英语的重要性毋庸置疑。想学好法律英语需要先了解它的特殊性，根据前人研究，法律英语的特殊性首要体现在法律英语词汇的特殊性。本研究期望借助移动学习的优势，设计一个法律英语词汇学习系统。经调研，目前的法律英语词汇教学在学习效率方面存在以下问题未能得到有效解决。第一，目标词汇不成体系，未能建立词汇的属性标注和关联关系。学习资源以纸质教材为主，存在学习内容有限等问题；相关词汇学习系统仅将纸质教材中的词表电子化，未能建立词汇的属性标注和关联关系。第二，词汇学习深度不足，没有突出重难点。一般英语学习的基本词汇信息未能满足法律英语词汇学习的深度需求；没有重点设计近义辨析这一教学难点。第三，词汇考查内容不足，复现形式单一。考查内容只包括词汇的基本信息，缺乏法律英语词汇的特殊信息；仅以测试实现词汇复现，形式单一。第四，一般英语的词汇推荐策略不完全适用于法律英语词汇。词汇优先级计算忽视了法律英语词汇高频使用特别术语的特征；近义词学习未能实现动态推荐。为解决上述问题，本研究在相关词汇教学理论和二语习得理论的指导下，借助移动学习的优势，设计了法律英语词汇学习系统。第一，多维度整合学习资源确立目标词汇资源库，对词汇的词源和部门法等属性信息进行标注、建立近义词汇的关联关系。第二，根据法律英语学习的深度需求，增加构成要件等学习内容，重点设计近义辨析模块。第三，设计多种复习题型考查法律英语词汇的基本信息和特殊信息，多种法律语境复现法律英语词汇。第四，结合法律英语词汇特征和学习者的学习情况，融入初始熟悉度因子综合计算词汇优先级，优化近义词推荐策略。对于资源建设和词汇推荐策略，本研究邀请专家通过访谈的形式验证了设计的合理性和有效性。对于功能设计部分，通过对学习者进行问卷调查和深度访谈的方式验证了本系统在认知负荷相似的情况下，在学习目标达成方面优于现有的词汇学习系统。本研究设计的法律英语词汇学习系统抓住了法律英语词汇的特征和教学重难点，发挥移动学习的优势，克服了纸质资源的局限性，作为课堂教学的有益补充，帮助学习者高效学习法律英语词汇。﹀
外文摘要：	︿ With the increase of China's foreign-related legal affairs, legal English plays an increasingly important role. To learn legal English well, students first need to understand its special features, which are mainly reflected in its vocabulary. This study aims to design a legal English vocabulary learning system with the help of mobile learning. When it comes to learning legal English vocabulary efficiently, several problems remain. First, learning resources have not been integrated effectively. The amount of vocabulary in paper textbooks is limited, and current systems neither label vocabulary attributes nor associate synonyms. Second, current systems fail to meet the learning depth that legal English vocabulary requires and do not treat synonym discrimination as a key point. Third, the review content of current systems cannot meet the review needs of students with different learning objectives, and review relies only on tests, which tends to bore students. Fourth, the vocabulary recommendation strategy of current systems is not suitable for legal English vocabulary. The difficulty grading ignores the influence of the learner's previous English level, and synonym recommendation cannot be adjusted dynamically. This study tries to solve these problems in the following ways. First, it integrates learning resources along multiple dimensions and labels the attributes of legal English vocabulary to meet the needs of different students at the learning portal. Second, it selects the learning content according to the characteristics of legal English vocabulary and highlights the difficult points of legal English vocabulary teaching. Third, it designs a variety of question types to meet learners' different review needs and makes learned vocabulary recur in different legal contexts to stimulate students' interest. Fourth, it takes students' prior English level into account to build an effective learning sequence for legal English vocabulary and different study strategies for synonyms. Experts were invited to evaluate the resource integration and vocabulary recommendation parts through interviews, which verified the rationality and effectiveness of the design. In addition, questionnaires and interviews with legal English vocabulary learners showed that the cognitive load of the system designed in this study is similar to that of existing systems, while it is more conducive to achieving learning goals. The legal English vocabulary learning system designed in this study captures the characteristics of legal English vocabulary and its key teaching points, overcomes the limitations of paper resources by virtue of the advantages of mobile learning, and serves as a useful supplement to in-class teaching, helping students learn legal English vocabulary efficiently. ﹀
分类号：	H08
论文总页数：	69
参考文献总数：	64
公开日期：	2019-06-04

基于思考帽理论的合作探究教学设计与实证.陈钗平

题名：	基于思考帽理论的合作探究教学设计与实证
姓名：	陈钗平
学号：	1601210453
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	姚亚芝
导师1单位：	软件与微电子学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2019-05-27
关键词：	六顶思考帽；支架式理论；合作探究教学；思辨能力
论文摘要：	︿信息时代，合作探究教学模式能够给予学生更充分的思考空间和更丰富的思维训练，逐渐成为教育教学中的热门研究话题。在合作探究教学中，主要由学生小组合作展开讨论学习，教师辅助进行指导。众多教育研究者提出合作探究教学是促进学生思辨能力的有效方式。然而，该模式在实际教学中面临了一些问题。笔者作为《翻译技术原理与实践》课程的助教，发现目前的合作探究教学存在三个问题：（1）如果缺少学生对彼此研究内容的评判或教师对学生发言观点的评价，难以培养学生思维的严谨性；（2）学生不能积极给予同伴建议或教师未能及时提供启发，将不利于训练学生思维灵活性；（3）若学生之间没有情绪表达沟通或教师未能提供积极的鼓励，容易导致学生的思维自主性难以被触动。针对上述问题，笔者阐述了在现有合作探究教学模式中引入六顶思考帽理论的必要性，明确了教学研究思路和研究方法，创新性地提出了基于六顶思考帽理论的合作探究教学模式。新模式主要包括：学生借助思考帽讨论流程强化小组讨论中的同伴互助作用，教师利用思考帽进行评价反馈增加小组讨论中的教师辅助引导作用。为验证新模式的有效性，笔者于2018年3月至6月对北京大学外国语学院的14名学生开展了教学实验，实验类型为单组前后测，前后测数据来源为学生的线上讨论记录和问卷调查，并通过定量分析和定性分析对教学实验前后测数据进行对比，验证基于思考帽理论的合作探究教学方法在实际课堂应用中的效果。实验结果表明，学生的思辨能力在灵活性、严谨性和自主性方面有提高，在一定程度上证明基于思考帽理论的合作探究教学对学生思辨能力提升有积极效果。该研究对合作探究教学模式进行了探索，为其进一步优化提供了新思路。﹀
分类号：	TP3
论文总页数：	60
参考文献总数：	51
公开日期：	2019-06-04

自适应英语写作系统社交模块的设计与实践.陈陟

题名：	自适应英语写作系统社交模块的设计与实践
姓名：	陈陟
学号：	1601210476
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	姚亚芝
导师1单位：	北京交通大学
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2019-05-27
外文题名：	Adaptive English Writing System Social Module Design And Practice
关键词：	大学英语写作；过程写作教学；协作学习；学习动机；结构化研讨
外文关键词：	English writing teaching; Collaborative learning theory; Motivation theory; Structured research method
论文摘要：	︿随着全球化的深入，英语写作变得愈发重要。学习者对于提升写作能力的诉求也越来越强烈，由于传统写作教学对写作成果的重视程度远大于其写作过程，写作能力的提升也并非一蹴而就，因此写作成为了众多学生英语学习的弱项。近年来随着在线学习的普及以及过程写作理论、协作学习等理论的推广，越来越多的学者开始重视写作过程，提出了群组讨论和同伴互评等协作学习方式，并将其应用到各类在线英语写作教学和英语写作学习系统中。然而，现有的在线英语写作教学、学习系统并未从英语写作学习的实践出发，群组讨论阶段存在讨论跑题、积极性不足等问题；同伴反馈阶段存在互评质量差，参与意愿低等问题，学生的写作能力提升缓慢。本文以二语写作理论、协作学习理论、激励理论及结构化研讨方法为依据，分析了现有写作教学模式和竞品在社交化模块上的优势与不足。针对其中存在的问题以及大学生群体的写作需求，结合社交化模块的评价标准，对自适应英语写作系统的社交化模块进行了设计。通过建立结构化的讨论社区，提高讨论的质量，培养学生的思维能力，降低学生在写作过程当中的无助感和焦虑感。通过建立结构化的同伴互评方式，帮助学生建立批改思路，获得多元化、高价值的反馈意见。通过奖励等机制的设计，提升学生在学习过程中的参与感和满足感，激发学生的参与意愿。由于系统中涉及的社交化机制较多，本文在此不一一详述，拣选两个具有代表性的功能——创意广场和同伴互评进行了研究和探讨，并针对这两个功能提出了不同的设计方案。此外还对系统中其他模块的设计进行了简单的介绍。基于以上设计，本研究选取了南京某高校20名非英语专业学生进行了教学实验。通过实验观察、数据分析、深度访谈、调查问卷等方式，论证了创意广场和同伴互评设计在提高群组讨论质量，降低学生无助感和焦虑感，培养学生思维能力，提升讨论积极性以及在提高同伴互评的质量，提升学生评判性思维能力、认知能力以及参与意愿等方面大有裨益，并筛选出了这两个功能的最优方案。本研究中自适应英语写作系统的社交化设计弥补了在线英语写作协作化学习方面的不足。在群组讨论阶段，讨论质量及学生们的思维能力均有所提高，缓解了学生写作时的焦虑和畏难情绪。在同伴互评阶段，提升了互评的质量，满足了学生对于多元化，高价值反馈的需求，提升了学生的能力，激发其参与反馈的意愿，补足了现有英语写作协作化学习设计的短板，对英语写作移动教学和协作化学习有一定的参考价值。﹀
外文摘要：	︿ As globalization continues to develop, English writing becomes all the more important, which strengthens English learners' desire to improve their writing skills. Traditional teaching of writing, however, emphasizes results over processes, and writing skills cannot be cultivated overnight, so writing remains a weak link for many students. In recent years, as online learning, the process theory of composition and collaborative learning theory have gained popularity, more scholars have begun to pay attention to the writing process. They have put forward collaborative learning approaches such as group discussion and peer review, and applied these approaches to various online English writing systems. However, existing online English writing systems are not grounded in the practice of learning to write. Group discussions drift off topic or lack motivation, and peer evaluation suffers from low quality and low willingness to participate. All of this leads to slow improvement in writing ability. Based on L2 writing theory, collaborative learning theory, motivation theory and the structured research method, this thesis analyzes the strengths and weaknesses of common teaching methods for English writing and of current teaching products. Following the evaluation standards for social modules, the author designed the social module of an adaptive English writing system, targeting the problems mentioned above and the writing needs of college students. Structured discussion groups are intended to improve the quality of discussion and train students' thinking, so as to reduce their helplessness and anxiety in the writing process. Structured peer review and intelligent recommendation of reviewing peers can help students form an approach to correction and obtain diversified and valuable feedback. In addition, with rewards and other mechanisms, students would be more satisfied and more motivated to participate in the discussion. As the system involves many social mechanisms, this thesis does not cover all of them and instead selects two core functions for research, namely the creative square and peer review. It presents different design proposals for these two functions and briefly introduces the design of other modules in the system. Twenty non-English majors from a university in Nanjing were invited to take part in a teaching experiment. Through experimental observation, data analysis, in-depth interviews and questionnaires, the author shows that the creative square and peer review can improve the quality of group discussions and reduce students' helplessness and anxiety. They also help to cultivate students' critical thinking, motivate students to discuss, and improve the quality of peer review and students' willingness to participate. The best design for each function was also selected. In this study, the social design of the adaptive English writing system optimized the process of collaborative learning, improved the quality of discussion, developed students' critical thinking and relieved anxiety in the writing process. In the post-writing stage, the quality of mutual evaluation improved, meeting students' need for diversified and valuable feedback. This can strengthen students' willingness to provide feedback and make up for the weak link in online English writing teaching, and it has some reference value for the mobile teaching of English writing and for cooperative learning. ﹀
分类号：	TP3
论文总页数：	52
参考文献总数：	53
Release date:	2019-06-05

面向考试应用的托福积极词汇学习微信小程序的设计.黄郭钰慧

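Title:	面向考试应用的托福积极词汇学习微信小程序的设计
Author:	黄郭钰慧
Student ID:	1601210558
Language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public availability:	After 3 years
Level:	Master's
Degree:	Professional Master of Engineering
Degree-granting institution:	Peking University
School:	School of Software and Microelectronics
Advisor:	朱源
Advisor affiliation:	School of Software and Microelectronics
Second advisor:	张宏岩
Second advisor affiliation:	School of Software and Microelectronics
Defense date:	2019-05-27
Title (foreign language):	The Design of a Wechat Applet on Active Vocabulary for the TOEFL Examination
Keywords (Chinese):	托福考试；积极词汇；微信小程序；二语词汇习得
Keywords (foreign language):	TOEFL; Active vocabulary; Wechat Applet; Second language vocabulary acquisition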
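Abstract:	Compared with general English vocabulary learning, TOEFL vocabulary learning is marked by strong subject specificity, a large learning load, and a short preparation period. Some learners misunderstand TOEFL vocabulary learning and assume that knowing what a word means is enough. The TOEFL, however, requires that a portion of the core high-frequency words be converted into active vocabulary, that is, words the learner can use fluently in listening, writing, and speaking. A survey shows that existing TOEFL vocabulary apps and WeChat Applets pay little attention to active-vocabulary training, so learners' grasp of vocabulary stays at the "reading" level. Starting from a needs survey, and building on TOEFL vocabulary teaching theory, active vocabulary learning theory, and second-language vocabulary acquisition theory, this study designs a WeChat Applet focused on TOEFL active vocabulary learning, with the aim of improving learners' efficiency and initiative. Compared with conventional vocabulary apps, it attempts the following improvements. First, to improve learning efficiency: it sorts out the subjects and sense groups commonly tested in the TOEFL, so that learners memorize words by category; through word-frequency analysis and induction, it compiles the core high-frequency TOEFL words as active vocabulary, including high-frequency words for listening, writing, and speaking; it uses chunk input, context input, association input, and multisensory input to support active-vocabulary input; and it tailors exercises to the specific TOEFL test formats to support active-vocabulary output. Second, to improve learning initiative: drawing on the ARCS model and cognitive load theory, the interface design raises learners' interest through "attention", "relevance", and "confidence"; and it exploits the close integration of WeChat Applets with WeChat groups, using a points-based reward system and a companion-learning system to strengthen learner engagement and activity. To validate the design, 55 learners at a university in Wuhan took part in a three-week teaching experiment. Learning efficiency was measured with a pre-test (before using the applet), a post-test (after using it), and a delayed post-test (one week after the post-test), and learning initiative (attention, relevance, confidence, satisfaction) was assessed with a satisfaction questionnaire. The results show that the applet designed in this study improves learning efficiency significantly more than existing TOEFL vocabulary WeChat Applets, but further exploration is needed on improving learning initiative.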
Abstract (foreign language):	Compared with general English vocabulary building, TOEFL vocabulary building is characterized by a wide range of disciplines, a heavy workload and comparatively little preparation time. Some students have misconceptions about TOEFL vocabulary building, falsely believing that knowing the Chinese meaning of the words is sufficient. Nevertheless, according to the TOEFL test requirements, students should be able to use some TOEFL words in the listening, writing and speaking tests. A survey shows that most apps and WeChat Applets do not help students build their active vocabulary effectively, so vocabulary building serves reading purposes only. Starting from a requirements survey, this study applies theories of TOEFL vocabulary coaching, active vocabulary building and second language acquisition to the design of a TOEFL active vocabulary building WeChat Applet, with the aim of improving learners’ learning efficiency and learning attitude. In comparison with existing alternatives, this WeChat Applet first attempts to improve learning efficiency in the following ways: 1. Classify TOEFL words by discipline and meaning group to enable memorization by category. 2. Generate a core TOEFL active vocabulary word list through frequency analysis, covering the most frequently used words for listening, speaking and writing. 3. Combine lexical-chunk input, context input, association input and multi-sensory input to improve students’ learning efficiency in terms of TOEFL active vocabulary input. 4. Tailor the exercises to the specific requirements of the TOEFL examination to enhance learners’ TOEFL active vocabulary output. Second, the WeChat Applet uses the ARCS model and cognitive load theory to improve students’ learning attitude in three areas: attention, relevance and confidence. It also engages students through a points-based motivational system and the use of WeChat groups. A three-week empirical experiment was conducted at a university in Wuhan, in which 55 first-year students tested the validity of the WeChat Applet. The students’ learning efficiency was examined using three tests, namely a pre-test (before using the WeChat Applet), a post-test (after using it) and a delayed test (one week after the post-test), while their learning attitudes were investigated through a satisfaction questionnaire. In conclusion, the WeChat Applet studied here can significantly improve students’ learning efficiency compared with existing apps on the market; nonetheless, further exploration is needed into improving students’ learning attitude.
Classification number:	G43
Total pages:	68
Number of references:	78
Release date:	2022-06-03

出版审校流程中专业审校与目标读者审校的对比研究——以《培养小极客》为例.张心彧

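Title:	出版审校流程中专业审校与目标读者审校的对比研究——以《培养小极客》为例
Author:	张心彧
Student ID:	1601210866
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public availability:	Public
Level:	Master's
Degree:	Professional Master of Engineering
Degree-granting institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1:	李博婷
Advisor 1 affiliation:	School of Software and Microelectronics
Thesis defense date:	2019-05-27
Foreign-language title:	Comparative Study of Editor's Review and Target Reader's Review During Publishing Process—A Case Study of Bringing Up Geeks
Keywords:	出版审校流程；专业审校；目标读者审校
Foreign-language keywords:	Review and Publishing Process; Editor’s Review; Target Reader’s Review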
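Abstract:	At present, translation studies in China concentrate mostly on translators, translated works, and translation strategies; research on review is comparatively scarce, and research on the publishing review of Chinese editions of books is rarer still. While translating Bringing Up Geeks (《培养小极客》), the author gained an in-depth understanding of the publisher's review process and found that some publishers, when bringing out Chinese editions, now add a target-reader review stage on top of the existing in-house professional review, in order to make up for the shortcomings of professional review, improve editing and proofreading quality, and enhance readers' reading experience. Drawing on specific cases from the translation and review of Bringing Up Geeks, this thesis takes target-reader review as its object of study and aims to clarify its value. It first argues, by combining theory with practice, that diversifying reviewers and bringing reader feedback forward are both possible, and analyzes from a theoretical standpoint the role target readers may play in review, thereby establishing the feasibility of involving target readers in reviewing Chinese editions. It then compares professional editors and target readers along four dimensions (professional review skills, purpose of review, review standards or norms, and review methods) and, through concrete cases from the book, examines in detail the role target-reader review plays in four types of problems: unclear sentence meaning, unclear word meaning, cultural differences, and expression optimization. Finally, together with an analysis of the limitations of professional review, it identifies the functions of target-reader review: (1) helping resolve ambiguities that professional review overlooks, making the translation more comprehensible; (2) polishing the wording, making the translation more readable; and (3) helping resolve cultural-difference problems that professional review overlooks and proposing diverse solutions, making the translation richer. The thesis shows that, in publishing Chinese editions, target-reader review is an effective complement to professional review; incorporating it into the publisher's review process helps offset the limitations and blind spots of professional review, improve editing and proofreading quality, and make books more appealing and persuasive to readers.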
Foreign-language abstract:	Currently, domestic translation studies mostly focus on translators, translated works, and translation strategies, while research on the review and publishing process is relatively rare. After finishing the translation of Bringing Up Geeks by the American writer Marybeth Hicks, the author of this paper paid closer attention to the review and publishing process and found that the publisher’s review contains not only the internal editor’s review but also an external target reader’s review. The latter is meant to make up for the limitations and shed light on the blind spots of the editor’s review, improve the quality of the translation, and enhance readers’ reading experience. Combining qualitative analysis with quantitative analysis, this paper analyzes the differences between the editor’s review and the target reader’s review and systematically clarifies the functions of the target reader’s review based on the specific problems that occurred during the review and publishing process of Bringing Up Geeks. The analysis shows that the editor’s review and the target reader’s review differ in skills, purposes, standards and methods. The research also finds that the target reader’s review can complement the editor’s review on four kinds of problems in the translation: ambiguity of sentences, ambiguity of words, cultural differences and expression optimization, even though the target reader may present wrong or unnecessary suggestions due to subjective factors. In conclusion, this paper argues that translation revision is a critical process in the publication of translated books. Target reader’s review, as a relatively new mode of review among domestic publishers, presents interesting interactions between the translator, editor and reader. It is an effective supplement to the editor’s review and therefore deserves more attention from both publishers and researchers.
Classification number:	G23
Total pages:	198
Number of references:	57
Release date:	2019-06-04

京剧回译中的文化还原策略——以《伶界大王：1870-1937年京剧再造时期的演员与公众》为例.汪楚楠

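Title:	京剧回译中的文化还原策略——以《伶界大王：1870-1937年京剧再造时期的演员与公众》为例
Author:	汪楚楠
Student ID:	1501210695
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public availability:	Public
Level:	Master's
Degree:	Professional Master of Engineering
Degree-granting institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1:	李博婷
Advisor 1 affiliation:	School of Software and Microelectronics
Thesis defense date:	2019-05-27
Keywords:	京剧 (Peking Opera)；回译 (back-translation)；文化还原 (cultural restoration)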
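Abstract:	Translation is not only a process of decoding and encoding language but also a process of cross-cultural communication. Restoring and returning to the original text in back-translation is therefore a restoration not only of language but also of culture. Based on the translation of Drama Kings: Players and Publics in the Re-creation of Peking Opera, 1870-1937 by the American author Joshua Goldstein (约书亚·葛以嘉), this thesis examines the problems in the source text's English rendering of Chinese culture and the restoration strategies adopted for different kinds of content in back-translation. After a brief review of research on back-translation, cultural restoration, and Peking Opera translation, the author first analyzes the characteristics of Goldstein's English rendering of Peking Opera culture and the translation difficulties it creates. Goldstein makes heavy use of transliteration and pinyin annotation when rendering Peking Opera terms and proper names. In the author's view, this approach preserves the foreign flavor of Chinese culture well, and the accompanying explanations help readers understand the cultural concepts; at the same time, some of the English renderings are inaccurate or inappropriate. Second, the author identifies the difficulties posed for back-translation by vague renderings of terms, polysemy, errors in transliterated personal names, and the diversity of cited sources. For the existing Chinese translation of Chapter 2, the author analyzes its problems, mainly inaccurate restoration of vocabulary, erroneous restoration, and imprecise back-translation of citations, and for each problem offers reflections and renderings the author considers more appropriate. The author then summarizes the restoration strategies used in back-translation for different situations, from two angles: vocabulary and citations. For vocabulary, the methods are: giving a rendering after careful and thorough verification; omitting explanations in the source that are redundant for Chinese readers; and adding in-text notes or footnotes to supply necessary explanation or background and improve readability. For citations, handling depends on whether the original quotation can be found and how closely the citation matches it, and falls into three methods: restoring the original quotation, adding a note on the source author's mistranslation, and translating the passage oneself.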
Foreign-language abstract:	Translation is not only a process of language decoding and encoding, but also a process of cross-cultural communication. In other words, translation is not only about bilingual transformation, but also about cultural exchange. Therefore, the restoration of the original text in the process of back-translation is a restoration not only of language but also of culture. Based on the translation project of Drama Kings: Players and Publics in the Re-creation of Peking Opera, 1870-1937 written by Joshua Goldstein, this paper probes into the translation problems of Peking Opera and the strategies adopted for different cultural contents in the process of restoration. This paper first summarizes the cultural content involved in the source text, which can be divided into two categories: vocabulary and citation. Then it analyzes the English translation of cultural content in the source text and points out the difficulties it poses to the back-translation, as they require the translator to handle each case carefully according to its circumstances. The paper then analyzes the problems in the Chinese translation of the second chapter of the source text, including inaccurate lexical restoration, incorrect restoration and imprecise back-translation of citations. In view of these problems, this paper puts forward appropriate translations. For the two categories of cultural content in the source text, the present paper proposes corresponding restoration strategies. Regarding vocabulary, there are three methods, namely, doing detailed research, omitting redundant information and annotating uncommon cultural concepts. As for citations, those whose sources can be found are copied from the source; the rest are translated by the translator. This may involve translating in the classical style of Chinese. In this case, Baidu’s classical-Chinese machine translation, still immature but being the only one of its kind, may be applied, with the result balanced by the translator.
Classification number:	H08
Total pages:	35
Number of references:	43
Release date:	2019-06-03

翻译中的原型效应转移策略探究——以《推和敲》为例.杨舒涵

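Title:	翻译中的原型效应转移策略探究——以《推和敲》为例
Author:	杨舒涵
Student ID:	1601210796
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public availability:	Public
Level:	Master's
Degree:	Professional Master of Engineering
Degree-granting institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1:	李博婷
Advisor 1 affiliation:	School of Software and Microelectronics
Thesis defense date:	2019-05-27
Foreign-language title:	Strategies for Prototype Shift in English-Chinese Translation: A Case Study of Word by Word
Keywords:	翻译方法；英汉翻译；原型转移法
Foreign-language keywords:	Translation methods; English-Chinese translation; Prototype Shift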
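Abstract:	This translation project is based on Word by Word (《推和敲》) by the American author Kory Stamper (柯丽·斯坦珀), a book about the behind-the-scenes work of dictionary making, a niche subject both in China and abroad. In trial-translating the book, the author found that its wording looks simple but is not easy to understand: conventional literal or free translation often fails to convey the source meaning accurately, and the prototype shift method is needed. To date, prototype shift has seen little use in translation research and practice, and scholars have not yet studied it extensively, which prompted the author's interest and the choice of this topic. Before starting the translation, the author analyzed the source language with corpus analysis tools to identify the translation difficulties. First, Stamper uses many low-frequency and rare words; quite a few have no ready Chinese equivalent, and for some even English background information is hard to find. Second, Stamper coins new senses from existing words and does not shy away from vulgar language. On this basis, and drawing on a large number of existing translation examples, the author proposes that prototype shift should follow three principles, the context principle, the "follow-the-host" principle (从主原则), and the equivalent-effect principle, and adopt four translation strategies: adding or omitting words to convey meaning (增减达义法), interchanging the abstract and the concrete (虚实互换法), making apt use of catchphrases (巧用流行语), and shifting meaning through homophony (谐音转义法), so as to convey the source meaning accurately at the semantic and cultural levels and ensure translation quality. The significance of this study lies in three aspects: (1) it introduces prototype theory from cognitive science and, on that basis, further explores the prototype shift method, offering a new perspective and approach to translation methods and moving beyond the traditional view that translation is either literal or free; (2) it analyzes the situations where prototype shift applies, helping break the current confinement of the method to brand names, film titles, and similar areas; (3) it summarizes three translation principles and four translation strategies that can resolve various translation problems fairly effectively. In short, prototype theory offers a new angle for understanding translation and new insights into some translation issues, and is of positive significance for current translation practice and theoretical research.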
Foreign-language abstract:	This paper discusses translation strategies for prototype shift in English-Chinese translation based on the translation practice of Kory Stamper’s book Word by Word. It is a nonfiction work explicating the lexicographic details and dilemmas encountered by Stamper as an associate editor at Merriam-Webster. In analyzing the book, this paper finds that it contains many simple words with obscure meanings, which, if translated in traditional ways, for example word for word or sense for sense, would not convey the correct meaning of the source text. For this reason, this paper proposes to adopt prototype shift, a concept that originates from cognitive science, to solve the problem. First of all, through a literature review, this paper finds that the prototype shift strategy has not been widely adopted in translation practice, nor has it been extensively explored by researchers. Secondly, analysis of existing translation cases leads to three translation principles, emphasizing the linguistic context, the cultural background and the translation effects, and four translation techniques. Finally, this paper compares the prototype shift strategy with some easily confused concepts, for instance the conversion approach and free translation. The significance of this study is mainly reflected in the following three aspects: (1) introducing prototype theory to provide a new perspective and approach for translation; (2) analyzing and exploring the applicable situations for prototype shift; (3) proposing three principles and four techniques to realize prototype shift effectively.
Classification number:	H059
Total pages:	35
Number of references:	29
公开日期：	2019-06-04

针对英语词汇石化问题的自适应词块系统研究与设计.王丽君

题名：	针对英语词汇石化问题的自适应词块系统研究与设计
姓名：	王丽君
学号：	1601210744
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软微
导师2姓名：	俞敬松
论文答辩日期：	2019-05-27
外文题名：	Research and Design of An Adaptive Chunk Learning System for the Ease of English Vocabulary Fossilization
关键词：	词汇石化；词块教学法；自适应学习；产出性练习；系统设计
外文关键词：	Vocabulary fossilization; Lexical approach; Adaptive learning; Productive exercise; System design
论文摘要：	语言石化（Fossilization）是中介语（Interlanguage）的特征之一，是指二语学习出现停滞不前甚至倒退的现象。其中，语音、词汇、语法等层面都可能出现石化，而防止或缓解词汇石化现象中的词汇能力石化问题是本研究的核心。大量研究表明，词块（Lexical Chunk）学习能够改善词汇能力石化问题，但传统的词块教学仍有以下三方面的局限：一、无序词块的学习内容忽视了词块间的纵聚合与横组合的语义关系，学习者难以构建词义网络；二、重视记忆过程，缺少产出练习；三、一致的学习内容和方法难以满足学习者的个性化需求。这些局限性容易给学习者造成已会运用词汇的假象，进一步导致词汇能力石化问题。为了改善词汇能力石化问题，本文根据词汇石化、语义网络、词块教学法以及二语习得其他相关理论和自适应学习理念，结合对现有英语词汇学习工具和中国大学生词汇能力石化现状的分析，设计了一款用于防止或缓解词汇能力石化问题的自适应词块学习系统，并完成了系统的原型设计。其核心思想如下：一、采用词块教学法培养学习者的词块意识和使用能力，增加词汇语境信息，避免因词义直接对等而产生母语负迁移；二、通过学习概念和搭配掌握词块间的关联性，构建并激活学习者的词义网络，增加表达的多样性和准确性；三、设计不同任务复杂度的练习题型，实现对词块从识记到产出的闯关式进阶；四、设置不同学习阶段和教学反馈的自适应规则，让学习内容具有针对性并引导学习者走出词汇使用舒适区，以此来避免用词惰性，改善词汇能力石化问题。本研究在54名本科一年级非英语专业的学生中进行了2周的教学实验。其中10人为先导小组，以确定学习材料及实验细节；其余44人随机分为实验组和对照组各22人，前测表明两组成绩不存在显著差异。实验组使用自适应词块学习系统，对照组采用传统的无序词块词表学习，两组均保证学习总量和内容完全相同。实验结束后进行后测，并在10天后进行延时测试。另外，还通过问卷调查和访谈对学习效果和满意度进行了补充验证。研究结果表明：自适应词块学习系统能够提高学习者词汇表达的多样性和准确性，并在缓解词汇能力石化上的保持效果和满意度方面都要优于传统词块学习法。本研究设计的自适应词块学习系统一方面能够有效缓解词汇能力石化问题，提高词汇表达的多样性和准确性；另一方面丰富了词块教学法的研究成果，对课堂教学和英语词汇学习相关工具的设计具有一定的借鉴意义。
外文摘要：	Fossilization, one of the main features of interlanguage, refers to stagnation or even regression in L2 learning. It may occur in pronunciation, vocabulary, and grammar; preventing and easing vocabulary fossilization is the core issue of this study. Many studies have shown that the lexical approach can ease vocabulary fossilization, but traditional lexical pedagogy still has three limitations. First, unordered chunk learning materials ignore the paradigmatic and syntagmatic relations among chunks, so learners can hardly build their semantic networks. Second, it emphasizes memorization but lacks productive exercises. Third, uniform learning materials and methods can hardly meet learners’ individual needs. These problems can aggravate vocabulary fossilization, because learners may be unable to use appropriate and varied words in actual contexts. To address these deficiencies, this paper draws on theories of language fossilization, semantic networks, the lexical approach, and other second language acquisition theories, together with the idea of adaptive learning, and analyzes the current state of learners’ vocabulary fossilization and existing English vocabulary learning tools. On this basis it designs a lexical chunk learning system for easing vocabulary fossilization and completes the prototype design of the system. The core ideas of the system are as follows. First, it cultivates learners’ awareness of lexical chunks and their ability to use them by adopting the lexical approach, so that lexical context information is enriched and direct equivalence of word meanings is avoided. Second, learners grasp the interrelationships among lexical chunks by acquiring concepts and collocations, so that their semantic networks are constructed and activated and the diversity and accuracy of their expressions are enhanced. Third, exercises of varying task complexity take learners from memorizing chunks to producing them. Fourth, adaptive rules for learning stages and feedback make learning more targeted and guide learners out of their comfort zone in vocabulary use. A two-week teaching experiment was conducted with 54 first-year undergraduates who were not English majors. Ten participants were randomly chosen for a pilot experiment to determine the learning materials and experimental details. The remaining 44 were randomly divided into an experimental group and a control group of 22 participants each. The pre-test showed no statistically significant difference between the scores of the two groups. The experimental group learned chunks with the adaptive lexical chunk learning system, while the control group used traditional unordered lexical chunk lists. The total amount and content of the learning material were the same for both groups. A post-test was administered at the end of the experiment, and a delayed test was conducted 10 days after the post-test. In addition, questionnaires and interviews were used to assess the learning effect and satisfaction of the different learning approaches. The results show that the adaptive lexical chunk learning system is superior to the traditional lexical chunk learning method in easing vocabulary fossilization, especially in the diversity and accuracy of learners’ expressions, as well as in retention and satisfaction. The adaptive lexical chunk learning system designed in this study can, on the one hand, effectively alleviate vocabulary fossilization and in particular improve the accuracy and diversity of expressions. On the other hand, it enriches research on the lexical approach and offers a reference for classroom instruction and the design of vocabulary acquisition tools.
分类号：	TP3
论文总页数：	107
参考文献总数：	77
公开日期：	2019-06-19

海外汉学著作精准回译策略研究——以《中国武术：从古代到21世纪》为例.钱康

题名：	海外汉学著作精准回译策略研究——以《中国武术：从古代到21世纪》为例
姓名：	钱康
学号：	1601210677
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软件与微电子学院
论文答辩日期：	2019-05-27
外文题名：	Strategies of Accurate Back Translation of Overseas Sinological Works——Taking Chinese Martial Arts: From Antiquity to the Twenty-First Century as an Example
关键词：	海外汉学著作；精准回译；武术翻译
外文关键词：	Overseas Sinological Works; Accurate Back Translation; Martial Arts Translation
论文摘要：	自上世纪八十年代以来，海外对中国的研究不断加强，出现了越来越多的汉学著作，中国学界也开始重视起这些著作的译介，这些译介在中国的海外汉学研究中扮演了重要角色。在海外汉学著作的翻译过程中，回译问题不可避免。由于海外汉学著作多为学术类著作，行文严谨，措辞谨慎，这也对译者的回译提出了更高的要求，需达到精准回译。本次翻译项目的源文本来自于《中国武术：从古代到21世纪》（Chinese Martial Arts: From Antiquity to the Twenty-First Century），该书介绍了中国武术的发展史，是一本典型的汉学著作，作者为美国著名历史学家龙佩（Peter Lorge）。本文首先阐述了本次研究的背景与意义，介绍了本次翻译的书籍以及所用到的翻译工具，然后从“海外汉学”、“回译”和“中国武术的翻译”三个角度进行了文献综述。在第三章中，笔者基于《中国武术》选译章节的翻译，总结出该书中出现的三大回译现象，即引文回译、词汇回译以及原文错误之回译。在引文回译中，笔者将引用类型细分成“直接引用”和“间接引用”，提出不同引用类型下引文的回译处理方式，其中对“间接引用”的引文进行回译时，还需留意“古今异义”现象的出现；在词汇回译中，笔者将词汇分为“人名”、“武器名”和“武术相关术语”三大类，分别就这三类词汇出现的回译问题进行了探讨；在原文错误之回译中，笔者将错误分为“名称错误”和“史实描述错误”两类，就错误性质进行了定性，并对这些错误的回译方式给出了建议。最后笔者基于本次翻译实践的过程，提出“巧用四字格”、“合理减译”、“归化为主”以及“恪守读者视角原则”四大精准回译策略。本次研究是对武术历史类海外汉学著作精准回译策略研究的初试，以期为同类著作的翻译提供一些参考。
外文摘要：	During the translation of overseas sinological works into Chinese, back translation is inevitable. Since overseas sinological works are mostly academic works written rigorously and cautiously, they place high demands on the translator’s skills and abilities, and accurate back translation is therefore needed. The present translation project is based on Chinese Martial Arts: From Antiquity to the Twenty-First Century by Peter Lorge, which discusses the history of Chinese martial arts. This paper summarizes three major problems in the back translation of overseas sinological works, namely citations, special nouns, and errors in the source text. It subdivides citations into “direct citations” and “indirect citations” and proposes a back translation method for each. It divides special nouns into three categories, “names of people”, “names of weapons”, and “martial-arts-related terms”, and discusses the back translation of each. It divides errors in the source text into two types, “errors of names” and “errors of historical description”, and advises on how to deal with them in back translation. Based on this translation practice, the paper then puts forward four strategies for accurate back translation, namely “using Chinese four-character structures”, “omitting information already known to the target audience”, “taking domestication as the mainstay”, and “observing the target-reader perspective”. In conclusion, the paper points out that for accurate back translation of works on Chinese martial arts, an extensive bibliographic search is a prerequisite and careful contextualization of the object of translation is necessary.
分类号：	H059
论文总页数：	191
参考文献总数：	43
公开日期：	2019-06-14

基于语料库方法研究G.K.切斯特顿的反犹问题.窦蕾

题名：	基于语料库方法研究G.K.切斯特顿的反犹问题
姓名：	窦蕾
学号：	1601210504
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	北京大学软件与微电子学院
论文答辩日期：	2019-05-27
外文题名：	Corpus-based Approaches to Anti-Semitism of G.K. Chesterton
关键词：	G.K. 切斯特顿；基于语料库；Cohen’s d；反犹主义
外文关键词：	G.K. Chesterton; Corpus-based Approaches; Cohen’s d; Anti-Semitism
论文摘要：	吉尔伯特·基思·切斯特顿是20世纪初的英国作家和记者。他生前被指控为反犹主义者，如今他是否反犹仍是有争议的问题。笔者使用语料库方法研究两个问题：1、切斯特顿的犹太观点有何显著的特点？2、他的犹太观点特点是否同特定的思维模式相关？研究步骤如下：1、建立切斯特顿几乎全部作品语料库和同时代英国英语参考语料库，使用POS和USAS标注系统进行标注。2、收集一组犹太主题词汇，在切斯特顿语料库中研究它们的搭配，分析出切斯特顿犹太观点的特点。3、笔者认为切斯特顿在不同时期作品中广泛分布的语言特征有可能是他的思维模式的语言表征，因而笔者依据年份信息将切斯特顿语料库分组，同时也将参考语料库分组，使用Cohen’s d方法计算两组语料语言特征的效应量差别，并选择Cohen’s d值大于0.8的语言特征作为具有关键性的语言特征，将它们视作潜在的切斯特顿思维模式的语言表征。4、筛选出具有关键性的语言特征和搭配的重合部分，并分析它们在切斯特顿语料库中的用法，考察其用法是否与同犹太主题词汇共现时的用法相通，以此揭示切斯特顿思维模式与犹太观点的联系。通过搭配分析，笔者发现：切斯特顿对犹太人在西方世界的存在、犹太人与金钱的关系、犹太人的人际关系都给予了负面的评价；他常常将犹太人分为不同的类型，并对这些类型进行两极化的评价；他多次将犹太教与基督教对立起来；他对犹太人的“身份”给予了一定关注。结合关键性分析，笔者发现切斯特顿的犹太观点存在背后思维模式的支撑：他将基督教作为理解其他宗教的参照，因而会将犹太教与基督教对立；他常常发现并展示事物的矛盾之处，而他对犹太人外在身份与实质的矛盾的关注吻合这一思维模式。
外文摘要：	Gilbert Keith Chesterton was a British writer and journalist of the early 20th century. He was accused of anti-Semitism during his lifetime, and whether he was an anti-Semite remains controversial today. This study uses corpus linguistics methods to answer two questions: 1. What are the most prominent features of his views of Jewishness? 2. Are those features related to his ideological frame of mind as a whole? The research steps are as follows: 1. Build a corpus of most of Chesterton’s works, as well as a reference corpus of British English of roughly the same era, and tag both corpora with the POS and USAS annotation systems. 2. Collect a set of words on the Jewish theme, calculate their collocations by lemma and semantic annotation in the Chesterton corpus, and identify the prominent features of Chesterton’s view of Jewishness through collocation analysis. 3. The author argues that key linguistic features widely distributed among Chesterton’s works from different periods may be the linguistic representation of his general frame of mind. Therefore, this study divides the two corpora into two groups of texts and uses Cohen’s d to calculate key linguistic features. According to the benchmark, those linguistic features with Cohen’s d larger than 0.8 are selected as key features and potential representations of his general frame of mind. 4. The author then picks out the overlap between the key features and the collocations, and analyzes whether their usage in the Chesterton corpus in general is related to their usage in collocation with words on the Jewish theme, in order to reveal the connection between Chesterton’s view of Jewishness and his general frame of mind. Through collocation analysis the author draws these conclusions: 1. Chesterton holds negative opinions of the Jewish people regarding their existence in the Western world, their relationship with money, and their interpersonal relationships; 2. he often divides Jewish people into different types and evaluates these types in polarized terms; 3. he sets up Judaism as the opposite of Christianity; 4. he is concerned with Jewish identity. Combining these findings with the key feature analysis, the author finds that Chesterton’s views of Jewishness are supported by his general frame of mind: he uses Christianity as a reference for understanding other religions, and thus pits Judaism against Christianity; he often finds and displays contradictions in things, so his concern with the contradiction between Jewish external identity and substance is consistent with this mode of thinking.
分类号：	I56
论文总页数：	60
参考文献总数：	50
公开日期：	2019-06-25
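
上面窦蕾论文摘要中提到，用 Cohen’s d 衡量两组语料在某个语言特征上的效应量，并以 0.8 作为筛选阈值。下面是一个极简的 Python 示意，假设某个特征在每组文本中的相对频率已经整理成一列数值。函数名 `cohens_d`、`is_key_feature` 和示例数据都是为说明而虚构的，并非论文的实际实现；这里采用的合并标准差公式只是 Cohen’s d 的一种常见定义，论文具体用的是哪种变体，摘要里没有说明。

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """用合并标准差计算两组数值之间的 Cohen's d（每组至少需要两个观测值）。"""
    n_a, n_b = len(group_a), len(group_b)
    # 合并方差：按自由度对两组的样本方差加权平均
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def is_key_feature(d, threshold=0.8):
    """按摘要中的说法，d 大于 0.8 的特征视为关键特征（阈值可调）。"""
    return d > threshold


# 虚构数据：某个语言特征在两组文本中的相对频率
target_freqs = [2.0, 4.0, 6.0]
reference_freqs = [1.0, 2.0, 3.0]
d = cohens_d(target_freqs, reference_freqs)
print(d, is_key_feature(d))
```

实际使用时，每组通常包含更多文本，并且需要对每个语言特征分别计算一次。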

2019-05-24

英文汉学著作的汉译：回译和变译.房一品

题名：	英文汉学著作的汉译：回译和变译
姓名：	房一品
学号：	1701212749
论文语种：	chi
专业：	专业学 - 翻译硕士 - 英语笔译
公开时间：	公开
培养层次：	硕士
学位：	翻译硕士专业学位
培养单位：	北京大学
院系：	外国语学院
导师1姓名：	朱源
导师1单位：	外国语学院
论文答辩日期：	2019-05-24
外文题名：	English to Chinese Translation of Sinology Publications: Back-translation and Translation Variation
关键词：	汉学；早期中国哲学；回译；变译
外文关键词：	Chinese Studies; early Chinese philosophy; back-translation; translation variation
论文摘要：	本翻译项目源文本取自《早期中国哲学中的情感元素》一书的部分章节。该书是多伦多大学文理学院东亚研究系副教授居里·维拉格（Curie Virág）所著，于2017年由牛津大学出版社在美国首次出版。该书围绕“情感”在早期中国思想家的理论中的地位展开研究，追溯了早期中国哲学概念的谱系，并考察了它们在古代中国伦理、政治和文化价值观形成中的关键作用。该书分为六个章节，本翻译项目选取了其中的前言、结论和前三章进行翻译，涉及内容包括：孔子《论语》中的情感元素和完整自我、《墨子》对人类社会的重新定义、《道德经》中宇宙欲望和人的能动性。居里·维拉格在哈佛大学东亚语言与文化系取得了博士学位；她的主要研究方向是前现代时期（战国至公元十二世纪）的中国哲学及思想史，已经出版三部学术著作并发表了三十多篇学术论文。作为一部以英语撰写的学术著作，本翻译项目选取的文本具备下列特点：语言风格正式、专业词汇多、名词化场景多、被动句和复合句多。此外，作为一部海外汉学著作，此书涉及大量用英文改译的汉学典故、文献名称和人名头衔名，给翻译造成了一定困难和挑战。
外文摘要：	This translation project is based on The Emotions in Early Chinese Philosophy. Written by Curie Virág and published by Oxford University Press, New York, in 2017, the book focuses on the significance of emotions in the theories of early Chinese philosophers, traces the genealogy of early Chinese philosophical conceptions, and examines their crucial role in the formation of ethical, political, and cultural values in China. The book consists of six chapters; the introduction, the conclusion, and the first three chapters are taken as the source text of this project. These chapters offer insights into emotions and the integrated self in the Analects of Confucius, redefinitions of the human community in the Mozi, and cosmic desire and human agency in the Daodejing. The author, Curie Virág, received her Ph.D. from the Department of East Asian Languages and Civilizations at Harvard University. She works in the fields of premodern Chinese philosophy and intellectual history (Warring States to the 12th century) and has published three academic books and more than thirty papers. As an academic work written in English, the selected text has the following characteristics: a formal style, many philosophical terms, frequent nominalization, and many passive and compound sentences. In addition, as an overseas work on Chinese Studies, the source text involves a large number of Chinese allusions, titles of references, and names of sinologists, which pose considerable difficulties and challenges for translation.
分类号：	H31
论文总页数：	14
参考文献总数：	17
参考文献列表：	︿黄忠廉：《变译理论:一种全新的翻译理论》，载于《国外外语教学》，2002年第1期。弘学：《禅林宝训》讲释，成都：巴蜀书社，2006。季进，邓楚，许路：《众声喧哗的中国文学海外传播——季进教授访谈录》，载于《国际汉学》，2016年第2期。焦鹏帅：《变译研究二十年：哲思、发展和国际化》，载于《外语与翻译》，2018年第2期。刘家润：《晦涩词句中的科学观——关于“老子”第一章的解读》，国学网，2006年12月21日。南怀瑾：《老子他说》续集。北京：东方出版社，2010。孙彬：《中国传统哲学概念“理”与西周哲学译名之研究》，载于《哲学与文化研究》，2015年第2期。谭载喜主译：《翻译研究辞典》，Mark Shuttleworth, Moira Cowie著。北京：外语教学与研究出版社，2005。王宏印：《从“异语写作”到“无本回译”——关于创作与翻译的理论思考》，载于《上海翻译》，2016年第3期。王楠：《对汉学论著翻译规范的探讨》，载于《史学月刊》，2002年第4期。吴万伟：《英汉学术翻译中的回译问题》，载于中国英汉语比较研究会《中国英汉语比较研究会第十次全国学术研讨会暨2012英汉语比较与翻译研究国际学术研讨会会议日程和摘要汇编》，2002。许峰：《海外中国学研究的发展前瞻——北京联合大学海外中国学研究中心成立大会暨学术研讨会述要》，载于《中共党史研究》，2012年第11期。叶红卫：《海外英文汉学论著翻译研究》，载于《上海翻译》，2016年第4期。赵旭东译：《帝国的隐喻：中国民间宗教》，Stephan Feuchtwang著。南京：江苏人民出版社，2009。 Virág, Curie. The Emotions in Early Chinese Philosophy. Oxford University Press, 2017. Craig, Edward, ed. Routledge Encyclopedia of Philosophy: Questions to Sociobiology. Vol. 8. Taylor & Francis, 1998. Heim, Michael Henry, and Andrzej W. Tymowski. Guideline for the Translation of Social Science Texts. American Council of Learned Societies, 2006. ﹀
公开日期：	2019-06-13

《译者的取与舍——简析英译汉的异化归化策略》.江皓如

链接

题名：	《译者的取与舍——简析英译汉的异化归化策略》
姓名：	江皓如
学号：	1701212752
论文语种：	chi
专业：	专业学 - 翻译硕士 - 英语笔译
公开时间：	公开
培养层次：	硕士
学位：	翻译硕士专业学位
培养单位：	北京大学
院系：	外国语学院
导师1姓名：	朱源
导师1单位：	中国人民大学外国语学院
论文答辩日期：	2019-05-24
关键词：	历史类文本；异化归化策略；词语；句式；修辞；思维逻辑
论文摘要：	︿《欧洲海外殖民帝国，1879–1999——一段短暂的历史》是一本历史题材类著作。作者探讨了 19 世纪末至 20 世纪末这一百年间欧洲海外殖民帝国的发展动力和历史轨迹，以及这段交织着欲望与血泪的殖民史对当今世界的种种影响。出于对世界历史的热爱和对历史的反思，笔者选择本书作为翻译实践的对象。在本篇报告中，笔者按照译前准备、译中处理和译后处理的顺序，先是简要回顾了国内外对于历史类文本英译汉的研究情况，再对作者选取的异化归化理论进行大致的介绍，并结合翻译实例，从词语、句式、修辞和思维逻辑四个方面分析得出结论——翻译实践中异化与归化并存，缺一不可，从而回答了译者对原文和译文如何取舍的问题。最后，笔者探讨了这两种翻译策略的研究意义，进一步思考了翻译理论对翻译实践的指导作用以及译者如何提升自身专业素质的问题。笔者希望借此番探讨引起广大翻译爱好者和从业者的共鸣。﹀
分类号：	H059
论文总页数：	277
参考文献总数：	10
参考文献列表：	︿韩烨：《释意理论观下的历史类读物翻译策略》，载于《明日风尚》，2017年第3期。胡开宝、谢丽欣：《论主体间性与英汉词典历史文本翻译》，载于《宁夏大学学报(人文社会科学版)》，2005年第6期。刘蓉：《从英汉民族思维差异看英汉语序》，载于《读与写杂志》，2009年第6卷第5期。刘婷玉：《浅析历史题材类文本的翻译策略——文本类型理论视角》，载于《海外英语（上）》，2017年第7期。刘婷玉：《浅析历史题材类文本英语被动语态的翻译策略——从主语和主题是否一致视角》，载于《海外英语（上）》，2017年第7期。 Newmark, P. Approaches to Translation. New York: Prentice Hall International (UK) Ltd, 1988. Nida, Eugene A. Toward a Science of Translating: With Special Reference to Principles and Procedures Involved in Bible Translating. Boston: Brill, 2003. Reiss, K. Translation Criticism: The Potentials and Limitations. (Translated by Erroll, F.R.) . Manchester: St Jerome Publishing, 1997/2000. (上海教育出版社，2004) Spears, Richard A. McGraw-Hill Dictionary of American Idioms and Phrasal Verbs. New York: McGraw-Hill, 2002. Venuti, Lawrence. The Translator’s Invisibility. Shanghai: Shanghai Foreign Language Education Press, 2009. ﹀
公开日期：	2019-06-25

2019-05-23

汉语“V-的”结构中的“的”及其锚定功能.叶永青

链接

题名：	汉语“V-的”结构中的“的”及其锚定功能
作者：	叶永青
学号：	1601213231
语种：	eng
专业：	文学 - 外国语言文学 - 外国语言学及应用语言学
公开时间：	3年后
培养层次：	硕士
学位：	文学硕士
培养单位：	北京大学
院系：	外国语学院
导师姓名：	何卫
导师单位：	外国语学院
答辩日期：	2019-05-23
题目(外文)：	The Anchoring Function of de in V-de Construction in Mandarin Chinese
关键字(中文)：	“V-的”结构；时态锚定；生成语法
关键字(外文)：	Verbal de; tense anchoring; generative linguistics
文摘：	︿大量的文献探究了生成语言学视角下的时态表达，但是前人对具体语言中的时态系统仍然没有明确定论。中文通常被认为缺乏显性的形态变位，因此学术界对其时态的表达机制有众多的讨论。本文旨在研究中文“V-的”结构中“的” 的时态功能和锚定机制。前人文献里讨论了与“V-的”相似的结构，如分裂句、事态句、焦点结构等等。本文试图将中文“V-的”结构与其它类似形式区分开来，并表明“V-的”结构表现出特殊的句法特性。中文“V-的”结构并不应该被视为和前人讨论的“是…的”等句内部结构一致，也不应被笼统归为是同一结构的不同变体。众多的研究观察表明中文“V-的”句有两个主要的句法表现：其一、时态上，中文“V-的”结构倾向于得到过去时的解读，且这种解读是由功能词“的”带来的。其二、中文“V-的”句与表示将来的时态标记，中文体助词“了”、“着”、“过”，以及句末“了”在句法上并不兼容。在此基础上，本研究对结构的讨论需要回答两个与中文“V-的”结构的句法属性相关的研究问题：第一、这个结构中的“的”如何产生表达偏向过去时的、非未来的时态解读？第二、为什么这个结构中的“的” 在句法上不允许与上述提到的元素共现？本研究在生成句法的视野和最简方案的框架下提出了一个解释，将“的” 视为有指示性质的词项，含有[+指示性]的特征，其功能为锚定事态。锚定的功能在中文的Dº和Tº上同样实现为词素“的”。“的”在“V-的”结构的句法生成过程中其位置从AspPº移动到Tº，最终落脚到Cº。这种论证的原因在理论上有Marantz（2013）的语境异义性（contextual allosemy）概念的支持，并在实践层面可以解释上述提到的中文“V-的”结构的众多句法表现。本文结构上首先简要介绍了中文“V-的”结构的一系列句法表现。文章第一章回顾了以往研究提出的关于时态机制的相关文献。第二章综述探究了在不同理论视角下前人研究对和中文“V-的”结构类似的不同形式的结构的分析，如焦点句、分裂句等。第三章讨论了中文“V-的”句的形式和句法表现，将其和其它的形式区分开来，明确定义了什么是本文讨论“V-的”结构，并进一步展开陈述本文要讨论的问题。文章第四章对词项“的” 在中文“V-的”结构中的句法结构和语义属性进行了解释。本研究旨在考察、描述、分析中文“V-的”结构的时态特性并从句法的层面提出解决方案。本研究的贡献在于帮助未来的研究区分与中文“V-的”结构相似的众多结构，并对后人有关中文“的”、焦点结构、信息结构等研究提供思路。本研究同时也为比较跨语言的时态表达和时态锚定的机制提供了一个视角，为后人讨论汉语时态的系统和时态锚定机制提供了话题。﹀
文摘（外文）：	︿ Tense expression has been extensively researched but not adequately attested under the generative linguistic paradigm. Theories of tense derivation of specific languages abound. Mandarin Chinese (henceforth Chinese) is generally considered to lack overt morphological tense inflection, and thus has been the subject of much scholarly debate on various tense-related issues. This paper sets out to investigate the tense interpretation of verbal de in the V-de construction in Chinese. The V-de construction has been given many labels in previous literature, such as cleft construction, state-of-affairs sentence, and focus construction, to name but a few. The present study attempts to distinguish the V-de structure from other analogous forms and suggests that it demonstrates particular syntactic properties unlike three other structure types previously thought to be variations of the same homogeneous construction as V-de. The current analysis examines two major unresolved puzzles of the V-de structure: a) it has been widely recognized to yield a preferred past reading, and its temporal information is proposed to have been realized via the functional item de; and b) it is incompatible with future tense markers, the aspectual auxiliaries le/zhe/guo, and sentential le. Such distinct properties lead to some inquiries about the syntax of V-de and the functions of its constituents. This study intends to answer the following questions: a) how does verbal de yield a non-future tense reading? and b) why does the structure disallow co-occurrence with the above-mentioned elements? The present study proposes an explanation to account for these syntactic properties from a formal perspective, aligning with the spirit of the Minimalist Program (henceforth MP). It regards verbal de as a featured item in the lexicon whose deictic feature could be realized when de is merged in either Dº or Tº. 
In both cases, its deicticity fulfills a general anchoring function, and its specificity varies with its particular representation on different functional heads. In the analysis of V-de, the temporal reading derived in the construction can be accounted for by the deicticity of de when merged in Tº. The syntactic process in V-de sentences is argued to be that de moves from AspPº to Tº and finally to Cº. This argument is theoretically supported by Marantz's (2013) concept of contextual allosemy and syntactically attested by evidence from Chinese verbal de constructions. This paper first provides a brief introduction to the syntactic behavior of the verbal de structure. Chapter one reviews relevant literature on tense mechanisms proposed in previous studies, which serve as the groundwork for tense research. The second chapter surveys past studies from different theoretical perspectives both on verbal de and on variant forms of the V-de construction, whose idiosyncrasy is concealed under various labels such as focus/cleft construction. Chapter three discusses the particular form and characteristics of the V-de structure and what does not count as a V-de structure by examining their syntactic representations and pinning down the exact issues to be addressed in this paper. Chapter four offers an explanation of the item de and the basic syntactic and semantic features of the V-de structure. This paper not only provides a descriptive examination of some puzzling structures but also puts forward a syntactic explanation of the tense properties of the V-de construction, in the hope of shedding light on the inquiry into the issues of V-de as well as on tense anchoring in Chinese. It meanwhile opens a window onto further cross-linguistic comparison in the expression of temporal and aspectual information, thus contributing to the large body of literature on the mechanism of the tense system of analytic languages in general and Chinese in particular. ﹀
分类号：	H04
论文总页数：	60
参考文献数：	91
参考文献：	︿ Adger, D. 2007. Three domains of finiteness: A minimalist perspective. Finiteness: Theoretical and Empirical Foundations. In I. Nikolaeva (ed.). 23–58. Oxford: Oxford University Press. Baker, M. & Travis,L. 1997. Mood as verbal definiteness in a “tenseless” language. Natural Language Semantics 5(3): 213–269. Chao, Y. R. 1968. A Grammar of Spoken Chinese. Berkeley: University of California Press. Chappell, H. & Thompson, S. A. 1992. The semantics and pragmatics of associative DE in Mandarin Chinese discourse. Cahiers de Linguistique—Asie Orientale 21(2): 199-229. Cheng, L. L-S. 2008. Deconstructing the shi de construction. The Linguistic Review 25, 3/4: 235–266. Chiu, B. H. 1993. The Inflectional Structure of Mandarin Chinese. Doctoral dissertation, UCLA. Chomsky, N. 1995. The Minimalist Program. Cambridge, MA: MIT Press. Comrie, B. 1985.Tense. Cambridge: University Press. Deng, S-H. 1979. Remarks on cleft sentences in Chinese. Journal of Chinese linguistics 7 (1): 101-114. Ehlich, K. 1982. Anaphora and deixis: same, similar, or different? In Jarvella & Klein (eds.): 315-338. Encyclopedia of Chinese Languages and Linguistics. 2015. In R. Sybesma, W. Behr, Z. Handel, C.-T. J. Huang& J. Myers (eds.). Leiden: Brill. Gärdenfors, P. & Brala-Vukanović, M. 2018. Semantic domains of demonstratives and articles: A view of deictic referentiality explored on the paradigm of Croatiandemonstratives. Lingua 201: 102-118. Gerner, M. 2009. Deictic features of demonstratives: Atypological survey with special reference to the Miao group. The Canadian Journal of Linguistics / La revue canadienne de linguistique, 54(1): 43-90. Gillon, C. 2009. Deictic features: evidence from Skwxwú7mesh. International Journal of American Linguistics 75(1): 1-27. Grano, T. 2017. Finiteness contrasts without Tense? A view from Mandarin Chinese. Journal of East Asian Linguistics 26(3): 259–299. Heine, B. T. K. 2002. World Lexicon of Grammaticalization. Cambridge: Cambridge University Press. 
Hinzen, W.& Sheehan, M. 2013. The Philosophy of Universal Grammar. Oxford: Oxford University Press. Huang, C-T. J. 2015. On syntactic analyticity and parametric theory. Chinese Syntax in a Cross-linguistic Perspective, In Audrey Li, Andrew Simpson & Dylan Tsai (eds.). 1-48. Oxford: Oxford University Press. Huang, C-T. J., Li, Y -H. A. & Li Y. F. 2008. The Syntax of Chinese. Cambridge: Cambridge University Press. Klein, W. 1994.Time in Language. London: Routledge. Klein, W., Li, P.&Hendriks, H. 2000. Aspect and assertion in Chinese. Natural Languageand Linguistic Theory 18:723–770. Levinson, S. C. 1983. Pragmatics. Cambridge: Cambridge University Press. Levinson, S. C. 2004. Deixis.The Handbook of Pragmatics.In L. Horn and G. Ward (eds.). 97–121. Oxford: Blackwell. Lin, J-W. 2000. On the temporal meaning of the verbal–le in Mandarin Chinese. Language and Linguistics 1(2):109-133. Lin, J-W. 2002. 论现代汉语的时制意义. Language and Linguistics 3(1): 1-25. Lin, J-W. 2003. Temporal reference in Mandarin Chinese. Journal of East Asian Linguistics 12:259–311. Lin, J-W. 2006. Time in a language without tense: The case of Chinese. Journal of Semantics 23: 1–56. Lin, J-W. 2010. A tenseless analysis of Mandarin Chinese revisited: A response to Sybesma 2007.Linguistic Inquiry 41:305–329. Lin, J-W. 2012. Tenselessness. The Oxford Handbook of Tense andAspect.In R. I. Binnick (ed.). 669–695. Oxford, UK: Oxford University Press. Lin, T-H. J. 2015. Tense in Mandarin Chinese sentences. Syntax, 18 (3): 320-342. Lyons, J. 1977. Semantics. Cambridge: Cambridge University Press. Marantz, A. 2013. Verbal argument structure: Events and participants. Lingua, 130:152–168. Modine, P. 1993. A theory of evolution of the Mandarin focus construction ‘shi…de’. Asian and African Studies (2): 154-168. Ning, C. Y. 1995. De as a functional head in Chinese. Paper presented at the Annual Forum of the Linguistic Society of Hong Kong. Paris. M-C. 1979. 
Nominalization in Mandarin Chinese: The morpheme de and the shi…de construction, DRL, Universite de Paris 7, Paris. Paul, W. 2005. Low IP area and left periphery in Mandarin Chinese. Recherches Linguistiques deVincennes 33: 111–133. Law, P. &Ndayiragije, J. 2017. Syntactic tense from a comparative syntax perspective. Linguistic Inquiry, 48(4): 679-696. Paul W. & WhitmanJ. 2008. Shi…de focus clefts in Mandarin Chinese. The Linguistic Review 25, 3/4: 413-451. Pollock, J.-Y. 1989. Verb movement, Universal Grammar, and the structure of IP. LinguisticInquiry, 20, 365-424. Pulleyblank, E. 1995. Outline of Classical Chinese Grammar. Vancouver: University of BritishColumbia Press. Reichenbach, H. 1947. Elements of Symbolic Logic. New York: The Macmillan Company. Ritter, E.& Wiltschko, M. 2005. Anchoring events to utterances without tense. In Proceedings ofthe 24th West Coast Conference on Formal Linguistics. In John Alderete et al. (ed.). 343-351. Somerville, MA:Cascadilla Proceedings Project. Ritter, E. &Wiltschko, M. 2009. Varieties of INFL: TENSE, LOCATION and PERSON. Alternatives to Cartography. In Jeroen van Cranenbroeck (ed.), 153–201. Berlin: Mouton de Gruyter. Ritter, E. &Wiltschko, M. 2014.The composition of INFL: An exploration of tense, tenseless languages, and tenseless constructions. Natural Language & Linguistic Theory 32(4): 1331–1386. Roberts, I. 1993. Verbs and Diachronic Syntax: a Comparative History of English and French.Dordrecht: Kluwer Academic Publishers. Simpon, A. Definiteness agreement and the Chinese DP.Language andLinguistics 2: 125–156. Simpson, A. 2002. On the status of ‘modifying’ DE and the structure of the Chinese DP. On the Formal Way to Chinese Languages. In S-W Tang & C-S Liu (eds.). 260-285. Stadford: CSLI Publications. Simpson, A.& Wu, Zoe X-Z. 2002. From D to T — determiner incorporation and the creation of tense. Journal of East Asian Linguistics 11: 169 - 209. Smith, C. S. & Erbaugh, M. S. 2005. 
Temporal interpretation in Mandarin Chinese. Linguistics 43 (4): 713–756. Soh, H. L., & Gao, M. 2008. Mandarin sentential -le, perfect and English already. Event Structure in Linguistic Form and Interpretation. In J. Dölling, T. Heyde-Zybatow, & M. Schäfer (eds.). 447-473. Berlin: Mouton de Gruyter. Stowell, T. A. 1982. The tense of infinitives. Linguistic Inquiry 13:561-70. Stowell, T. A. 1995. The phrase structure of tense. Phrase Structure and the Lexicon. In J. Rooryck & L. Zaring (eds.). 277-291. Dordrecht: Kluwer Academic Publishers. Sybesma, R. 2007. Whether we tense-agree overtly or not. Linguistic Inquiry 38: 580–587. Tang, T-C. 1983. Guoyu de jiaodian jiegou: fenlieju, fenlie bianju yu zhun fenlieju [Focusing constructions in Chinese: cleft sentences and pseudo-cleft sentences]. Universe and Scope. Presupposition and Quantification in Chinese. In T-C Tang, R. L. Cheng, & Y-C Li (eds.). 127 - 226. Taipei: Student book Co. Teng, H-H. 1979. Remarks on cleft sentences in Chinese. Journal of Chinese Linguistics 7:101–113. Tsai, W-T. D. 2008. Tense anchoring in Chinese. Lingua 118 : 675–686. Warglien, M.&Gärdenfors, P. 2013. Semantics, conceptual spaces, and the meeting of minds. Synthese, 190: 2165-2193 Wiltschko, M. 2003. On the interpretability of tense on D and its consequences for Case Theory. Lingua113:659-696. Wiltschko, M. 2004. Expletive categorical features: A case study of number in Halkomelem. InProceedings of NELS 35 (2).In Leah Bateman, & Cherlon Ussery (ed.). 631–646. Amherst, MA:GLSA Publications. Wiltschko, M. 2014. The Universal Structure of Categories: Towards a Formal Typology. Cambridge University Press. Wu, J-S. 2009. Tense as a discourse feature: Rethinking temporal location in Mandarin Chinese.Journal of East Asian Linguistics 18: 145–165. Xu, Y. 2014. A corpus-based functional study of shi...de constructions. Chinese Language and Discourse 5(2): 146–184. 邓思颖. 2006. 以“的”为中心语的一些问题. 当代语言学(3). 李讷, 安珊笛, 张伯江. 1998. 从话语角度论证语气词“的” 中国语文(2). 
李铁根 2002. “了”、“着”、“过”与汉语时制的表达. 语言研究(3). 林若望. 2017. 再论词尾“了”的时体意义.中国语文(1). 刘勋宁. 1985.现代汉语词尾“了”的语法意义. 中国语文(5). 刘勋宁. 1990. 现代汉语句尾“了”的语法意义及其与词尾“了”的联系. 世界汉语教学(2). 吕叔湘主编. 1980. 现代汉语八百词. 商务印书馆. 郭锐. 2015. 汉语谓词性成分的时间参照及其句法后果.世界汉语教学(4). 郭锐. 2016. 汉语叙述方式的改变和“了1”结句现象. 中国语文 (263). 黄正德. 1990. 說「是」和「有」.中央研究院歷史語言研究所集刊 (59). 马学良&史有为. 1982. 说“上哪儿的”及其“的”. 语言研究(1). 麦子茵. 2012. 终结性与“（是）…的”的焦点结构. 语言学论丛(44). 木村英树. 2003. “的”字句的句式语义及“的”字的功能拓展. 中国语文(4). 杉村博文. 1999. “的”字结构、承指与分类. 汉语现状与历史的研究（江蓝生、侯精一主编）.中国社会科学出版社. 石毓智. 2000. 论“的”的语法功能的同一性. 世界汉语教学 (1). 石毓智. 2005. 论判断、焦点、强调与对比之关系—“是”的语法功能和使用条件. 语言研究 25 (4). 石定栩. 2008. “的”和“的”字结构. 当代语言学(4). 宋玉柱. 1981. 关于时间助词“的”和“来着”. 中国语文(4). 史有为. 1984. 表已然义的“的b”补议. 语言研究(1). 完权 2018. “的”和“的”字结构. 上海：学林出版社. 完权. 2013. 事态中的“的”. 中国语文(1). 王文颖. 2016. 现代汉语“是……的”句的焦点结构研究. 博士论文: 北京大学中国语言文学系. 袁毓林. 1995. 谓词隐含极其句法后果—“的”字结构的代称规则和“的”的语法、语义功能。中国语文(4). 袁毓林. 2003a.从焦点理论看句尾“的”的句法语义功能.中国语文(1). 袁毓林. 2003b.句子的焦点结构及其对语义解释的影响. 当代语言学 (4). 朱德熙. 1961. 说“的”. 中国语文(12). 朱德熙. 1978. “的”字结构和判断句. 中国语文(1-2). 朱德熙. 1982. 语法讲义. 北京: 商务印书馆. 朱庆祥. 2017. 也论“应该∅的”句式违实性及其相关问题.手稿. ﹀
公开日期：	2022-06-04

2019-05-20

供应链金融下中小企业信用评级研究 -以工程机械行业为例.孙浩

链接

题名：	供应链金融下中小企业信用评级研究 -以工程机械行业为例
姓名：	孙浩
学号：	1701211051
论文语种：	chi
专业：	专业学 - 工程管理硕士 - 工程管理硕士
公开时间：	公开
培养层次：	硕士
学位：	工程管理硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	张宏岩
导师1单位：	软件与微电子学院
论文答辩日期：	2019-05-20
关键词：	供应链金融；信用评级；因子分析；Logistic回归模型
论文摘要：	︿中小企业在优化经济结构和缓解就业压力等方面呈现出重要的价值，但是受到生产经营规模较小、管理模式落后等因素的制约，中小企业的融资渠道极为狭窄，融资成功率也较低，极大地限制了中小企业进一步发展壮大的步伐。与此同时，国内供应链金融随之应运而生,商业银行等金融机构帮助中小企业周转流动资金，实现多方互利共赢。然而，供应链金融存在信息不对称风险，不同的供应链金融模式所潜在的风险也具有显著差异。随着我国供应链金融行业呈现出迅猛的发展态势，商业银行在经营过程中开始面临在供应链的特殊环境下对中小企业的信用进行风险评估的问题。本文以供应链金融的发展状况作为宏观研究背景，通过对工程机械行业供应链金融融资模式及相应模式下的风险特征的研究，筛选并优化工程机械行业供应链金融信用评价指标，量化工程机械行业供应链金融环境下中小企业潜在的信用风险。首先，本文阐述了研究命题所涉及的相关理论内容，即供应链金融概念、融资模式类型以及相关信用评价体系等；其次，详细阐述了当前工程机械行业供应链金融下不同的融资模式的具体流程及各自的风险特征，从而为构建工程机械行业基于供应链金融环境下的信用指标体系形成良好的前提条件；最后，选取了财务数据完善的工程机械行业中新三板企业作为样本，运用因子分析法对初选的信用指标体系进行降维处理，并利用Logistic回归模型来完成基于供应链金融环境下工程机械行业中小企业信用风险评价体系的构建。本研究构建了工程机械行业基于供应链金融的信用评价指标体系，并检验了指标体系的可行性。本指标体系的设计和实现对工程机械行业中小融资企业具有理论价值和现实意义。﹀
分类号：	F83
论文总页数：	49
参考文献总数：	56
参考文献列表：	︿ [1] 李芹,吴丝丝,霍强.中小企业融资困境与供应链金融创新研究[J].经济论坛，2014（05）：61-67. [2] 宋华.供应链金融[M].二版.北京：中国人民大学出版社，2016：8-13. [3] 丁汀,李雪梅.供应链金融解决中小企业融资的优势分析[J].物流技术，2009（07）：73-75. [4] 李金龙.2011.供应链金融理论与实务[M].北京:人民交通出版社, 5-6． [5] 弯红地.供应链金融的风险模型分析研究[J].经济问题,2008（11）．　 [6] B. A. Ahn, S. S. Cho and C.Y Kim. The integrated methodology of rough set Theory and artificial neural network for business failure prediction. Expert Systems with Applications 2008,18(2):65-74. [7] Dr Clarence N. W. Tan, Bond University, Gold Coast,Qld. A Study on Using Artificial Neural Networks to Develop an Early Warning Predictor for Credit Union Financial Distress with Comparison to the Probit Model[J].Managerial Finance,2011,27(4):56-77. [8] Dadios Kumarasamy, Prakasb Singh. Access to Finance, Financial Development Countries and Firm Ability to Export: Experience from Asia-Pacific countries[J].Asian Economic Journal，2412,32(1). [9] Guilherme Barreto Fernandes. Application of metabolic GM (1,1) model in financial repression approach to the financing difficulty of the small and medium-sized enterprises[J].Grey Systems:Theory and Application,2016,4 (2). [10] Maldonado S, Bravo C, Lopez J, et al. Integrated framework for profit-based feature selection and SVM classification in credit scoring[J]. Decision Support Systems, 2017, (04):113-121. [11] 曾筝.商业银行信用风险评估方法研究[J].计算机仿真，2011,28(08):372-375. [12] 运迪,周建辉.基于改进Z值模型的企业信用风险评估与检验[J].统计与决策，2014(10):173-176. [13] 曾玲玲,潘霄,叶曼.基BP-KMV模型的非上市公司信用风险度量[J].财会月刊，2017(18):47-55. [14] 奚梦缘.中小企业信用指标体系构建及评估模型的最优化[J].经济问题,2018（10）． [15] Shashank Pao, Thomas J. Goldsby. Supply chain risks: a review and typology [J]. The international journal of logistics and management,2009,20（1）：97-123. [16] Demica. Supply chain finance：a third report form Demica[R]. London, UK,2009. [17] Sunil Chopra, Peter Meindl. Suply chain management: strategy, planning and operation [M]. London, UK：Pesrson Pres,2009. [18] Chih-Yang Tsai，On delineating supply chain cash flow under collection risk[J]. 
International Journal of Production Economics,2010(1):186-194. [19] Bob Dyckman. Integrating supply chain finance into the payables process[J]. International Journal of Production Economics,2011(3):172-180. [20] Abhijeet Ghadge, Samir Dani, Michael Chester，Roy Kalawsky. A systems approach for modeling supply chain risks [J]. Supply chain management：an international journal，2013,18(5):523-538. [21] 张浩.基于供应链金融的中小企业信用评级模型研究[J]．东南大学学报（哲学社会科学版）,2008（2）． [22] 熊熊,马佳,赵文杰.供应链金融模式下的信用风险评价[J]. 南开管理评论,2009(4). [23] 胡海青,张琅,张道宏.供应链金融视角下的中小企业信用风险评估研究——基于SVM与BP神经网络的比较研究[J].管理评论,2012（11）. [24] 夏泰凤,王红梅; 中小企业供应链融资模式的风险管理[J].经济导刊,2012（1）. [25] 郭战琴.基于供应链金融的小微企业融资模式——以第三方龙头物流企业为平台[J].金融理论与实践,2012(1):76-83. [26] 陈长彬,盛鑫.供应链金融中信用风险的评价体系构建研究[J].福建师范大学学报(哲学社会科学版) ,2013(2). [27] 黄静思,宋河,宋新红.供应链金融贷款风险识别与评价方法研究.金融理论与实践[J]. 2014 (2):46-49. [28] 胡慧慧,傅为忠.基于改进灰色关联度方法的互联网供应链金融风险评价[J].武汉金融.2016 (3) :51-55. [29] 高翔,贾亮亭.基于结构方程模型的企业跨境电子商务供应链风险研究——以上海、广州、青岛等地167家跨境电商企业为例[J].上海经济研究,2016(05):76-83. [30] Angapp Gunasekaran，Kee -hung Lai，T.C. Edwin Cheng.Responsive supplly chain：a competitive strategy in a networked economy[J]. The international journal of management science,2008,36:549-564. [31] Bernabucci R.J. Supply chain gains from integration[J]. Financial Executive,2008,24（3）：46-48. [32] Bing Jing，Abraham Seidmann. Financing sourcing in a supply chain [J]. Decision support systems,2014,58(2):15-20. [33] 赵亚娟,杨喜孙,刘心报.供应链金融与中小企业信贷能力的提升[J].金融理论与实践,2009（10）． [34] Bob Dyckman. Supply chain finance：risk mitigation and revenue growth [J]. Journal of corporate treasury management,2011,4(2):168-173. [35] Camerinelli D. Supply chain finance[J]. Journal of Payments Strategy & Systems,2009,3(2):114-128. [36] Cossin D, Hricko T. A structural analysis of credit risk with risky collateral： A methodology for haircut determination [J]. Economic Notes,2003, 32(2):243-282. [37] 贾俊平，何晓群，金勇进.”十二五”普通高等教育本科国家级规划教材,21世纪统计学系列教材「Ml.中国人民大学出版社,2012,(05):33-57. [38] 杨丹清.供应链金融背景下中小企业融资模式探究[J].合作经济与科技,2016(03):50-51. 
[39] Chih-Yang Tsai. On delineating supply chain cash flow under collection risk [J]. International journal of production economics,2011,129（1）：186-194. [40] David A. Wuttke, Constantin Blome, Michael Henke. Focusing the financial flow of supply chains: an empirical investigation of financial supply chain management [J]. International journal of production economics,2013,145（2）:773-789. [41] Epley R. Donald, Liano Kartono,Haney Richard. Borrower risk signaling using loan-to-value ratios[J]. Journal of Real Estate Research,1996,11（1）:71-86. [42] F. Mathis, J. Cavinato. Financing the Global Supply Chain： Growing Need for Management Action [J]. Thunderbird International Business,2010,52（6）:467-474. [43] 张文春.供应链金融视角下中小企业融资路径分析[J].商业时代,2010(26):85-116. [44] Hans -Christian Pfohl, Moritz Gomm. Supply chain finance: optimizing financial flows in supply chains [J]. Logist Research,2009（1）:149-161. [45] M. Theodore， Paul D. Hutchison. Cash-to-cash： the new supply chain management metric[J]. International journal of physical distribution & logistics management,2002,32 (4):288-298. [46] Miao He, Changrui Ren, Qinhua Wang, Jin Dong. Chapter 3:supply chain finance:concept and modeling [C]// Feiyue Wang. Service science management and engineering. Hangzhou: Zhejiang University Press,2012:37-58. [47] Mingsheng Yang. Research on supply chain finance pricing problem under radnom demand and permissible delay in payment[J]. Procedia computer science,2013(17):245-257. [48] P.L. Abad,C. K. Jaggi. A joint approach for setting unit price and the length of the credit period for a seller when end demand is price sensitive [J]. International journal of production economics, 2003(83):115-122. [49] Peter Finch. Supply chain risk management [J]. Supply chain management：an international journal,2004,9(2):183-196. [50] Rhian Slivestro, Paola Lustrato. Integrating financial and physical supply chain：the role of banks in enabling supply chain integration [J]. 
International journal of operations & production management,2014,34（3）：298-324. [51] Tseng, M.L., Chiang, J.H., Lan, W.L. Selection of optimal supplier in supply chain management strategy with analytic network process and choquet integral. Comput[J]. Ind. Eng. 2009,57 (1): 330-340. [52] Wesley S. Randall， M. Theodore Farris. Supply chain financing：using cash -to -cash variables to strengthen the supply chain [J]. International journal of physical distribution & logistics management,2009,39(8): 669-689. [53] Shang, K.H., Song, J.S., Zipkin, P.H. Coordination mechanisms in decentralized serial inventory systems with batch ordering. Manag. Sci. 2009, 55 (4):685-695. [54] Vickery, Jayaram, Droge & Calantone. The effect of an integrative supply chain strategy on customer service and financial performance： an analysis of direct versus indirect relationships [J]. Journal of operations management,2003,21(5):523-539. [55] Wuttke, D.A., Blome, C., Heese, H.S., Protopappa-Sieke, M. Supply chain finance: optimal introduction and adoption decisions. Int. J. Prod. Econ. 2016,178: 72-81. [56] Xiangjun He, Lingyun Tang. Exploration on building of visualization platform to innovate business operation pattern of supply chain finance [J]. Physics procedia,2012(33):86-93. ﹀
公开日期：	2019-06-03

国际视角下建筑行业协会合作对建筑职业培训效果影响的研究.田志伟

链接

题名：	国际视角下建筑行业协会合作对建筑职业培训效果影响的研究
姓名：	田志伟
学号：	1701211055
论文语种：	chi
专业：	专业学 - 工程管理硕士 - 工程管理硕士
公开时间：	公开
培养层次：	硕士
学位：	工程管理硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	张宏岩
导师1单位：	软件与微电子学院
论文答辩日期：	2019-05-20
外文题名：	Research on effect of construction industry vocational training from the perspective of international cooperation among NGOs
关键词：	协会国际合作；建筑职业培训；博弈
外文关键词：	International cooperation between industry associations; Vocational training; Game theory
论文摘要：	︿我国建筑业科技水平相对较低，从业者安全意识和专业知识相较于发达国家有所不足，导致建筑安全事故较多，给工人生命安全和经济发展带来了危害，加之国内建筑职业培训成效不足，使得目前建筑工人技能水平和职业素质达不到行业发展需要。有效的培训能够实现建筑工人专业技能和职业素养的提高。海外职业培训经验表明，良好的职业培训效果是工人、政府、企业、行业协会多方参与、良性互动、有机融合的结果。本文首先分析了当前国内外建筑安全形势和职业培训的特点，通过对建筑企业、农民工和国外协会负责人的调研访谈，发现了我国建筑业农民工职业培训的现状及问题，对国内建筑培训成效不足的原因进行深入分析。针对存在的问题，借鉴了国际先进职业培训项目的经验，并着重分析研究了中外建筑行业协会合作开展培训项目的巨大价值和前景。作者通过全面分析英美职业培训的特点，论述协会开展国际职业培训合作对国内建筑业发展的积极意义。文章用博弈论对建筑职业培训中工人、建筑企业、协会和政府之间的博弈关系进行了分析。职业培训主要涉及政府企业间博弈、农民工和企业的博弈以及培训项目提供和参与者之间的博弈。通过加入行业协会的角色，利用协会在国际交流、信息收集、专业知识等方面的独特优势，讨论了中外协会国际合作背景下博弈结果优化的可能性，从而实现吸收英美国家职业培训经验，提高职业培训效果的目的。通过作者对国内建筑企业，建筑工人和中外行业协会的调研数据，对参与中外职业培训合作项目的中国建筑企业和农民工的收益成本进行了实证分析，并提出了改善职业培训效果，促进中国建筑业发展，提升工程安全水平的建议。﹀
外文摘要：	︿ The backwardness of construction technology, weak safety consciousness, and a lack of professional skills among migrant workers lead to construction accidents in China, which pose a severe threat to lives, families, and the economy in general. There are no sufficient training programs for migrant workers, which makes the situation even worse. Because of the lack of effective training programs, migrant workers do not possess the necessary safety skills and are hence unable to meet the requirements of overseas construction projects. Vocational training aims to improve the skills and knowledge of migrant workers. Studies of foreign vocational training conclude that effective training results from the involvement of workers, organizations, enterprises, and the government, with positive interaction and better integration among them. The paper first introduces the trends in global construction safety and the traits of vocational training. Through questionnaires for enterprises, migrant workers, and industry associations, the author has developed a comprehensive understanding of the current construction industry training system and identified its existing problems. To address the ineffectiveness of domestic training programs, the factors that lead to the problems are examined through the experience of foreign vocational training and a literature review. It is international cooperation among associations from different countries that can make training programs meaningful and generate huge benefits for the whole industry. The author derives the reasons through a thorough analysis of the training experience in the United States and the United Kingdom. The stakeholders of the construction industry are analyzed with game theory. The main players in this model are migrant workers, construction enterprises, industry associations, and the government. 
The paper shows that adding the role of the association to the game model of vocational training can optimize the outcome of the game compared with the previous model, in which industry associations did not participate, owing to the unique advantages that associations possess, such as international cooperation, information gathering, and professional knowledge. The study aims to improve the effectiveness of vocational training, using collected data to empirically analyze the migrant workers and enterprises involved in vocational training programs between China and the US and UK. Finally, the paper gives some suggestions on how to improve the effectiveness of vocational training, in the hope of further promoting the development of China's construction industry and the safety of engineering. ﹀
分类号：	F26
论文总页数：	62
参考文献总数：	74
Release date:	2019-06-11

2018-11-30

中国技术写作认证考试设计与实证.阮羽

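Title:	中国技术写作认证考试设计与实证
Author:	阮羽
Student ID:	1401210700
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Availability:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	何卫
Advisor 1 affiliation:	School of Foreign Languages
Advisor 2 name:	高志军
Advisor 2 affiliation:	School of Software and Microelectronics
Defense date:	2018-11-30
English title:	The Design and Verification of the Technical Writing Certification for Chinese Technical Writers
Keywords (Chinese):	技术写作；能力要求；认证考试
Keywords (English):	technical writing; competency requirements; certification examination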
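Abstract (Chinese version):	︿As China's economy has grown, many companies and universities have come to recognize that technical writing plays an increasingly important role in product sales and user satisfaction. They have begun to invest in training talent at universities and supplying it to industry, and they urgently need a benchmark for selecting qualified people. Europe and the United States already have relatively mature technical writing certification exams, such as the CPTC exam of the Society for Technical Communication in the United States and the TCTrainNet exam of the German association for technical communication. Transplanting these exams directly to the Chinese market is not appropriate, however, for three reasons. First, the content of foreign certification exams does not match the job requirements and competency requirements of technical writing positions in China. Second, advances in society and technology have placed new demands on technical writing, for example in content design, writing requirements, and quality control. Third, foreign certification exams emphasize theoretical knowledge of technical writing and barely assess practical skills. To address these problems, the author proposes designing a certification exam around the needs of China's technical writing industry and sets out the research approach and methods. First, the thesis defines the competency construct through "work tasks" and explains the competencies that test takers need to master. Drawing on job postings, interviews with practitioners, technical writing courses, and existing technical writing exams, the author concludes that technical writers need five competencies: analysis, design, writing, quality control, and publishing. Second, on this basis the author drafts a syllabus for a Chinese technical writing certification exam and cross-validates it through expert review to establish its validity. Next, the author discusses the suitability of each question type in light of the exam content and proposes a design method for each. The author then discusses the scoring criteria used in this study, based on the characteristics of technical writing and existing scoring criteria. Finally, following the syllabus and exam methods, the author carries out three experiments: the first with technical writers who have more than two years of work experience, the second with technical writers who have been in the field for less than half a year, and the third with students of the 2017 cohort in Computer-Aided Translation at Peking University. The test results verify the reliability and validity of the sample paper. The results show that the proposed exam syllabus and design method are valid, reliable, and feasible. The resulting exam meets both the needs of the Chinese market and the new demands that the present era places on talent. The author hopes that the exam design proposed here will inspire and encourage more companies and industries to pay attention to the development of the technical writing profession and to the training of its practitioners.﹀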
Abstract (English version):	︿As China's economy has grown, many enterprises and universities have become aware that technical writing plays an increasingly important role in product sales and customer satisfaction. How to train technical writers and evaluate their output has become a concern. At present, there are several certification exams related to technical writing, such as the CPTC exam of the Society for Technical Communication and the TCTrainNet exam of tekom. However, these exams cannot be applied directly to China. First, the content of foreign certification exams does not always fit Chinese job requirements. Second, progress in society and technology has posed new challenges for technical writing, such as content design, writing requirements, and quality control. Third, foreign certification exams focus on theoretical knowledge, while practical skills are barely assessed. This thesis proposes a technical writing certification designed around the demands of China's technical writing industry and sets out the research methods. To begin with, it defines the competency construct through job analysis. From job postings, industry interviews, technical writing courses, and existing technical writing certifications, five major competencies are identified: analysis, design, writing, quality control, and release. Then the thesis determines the outline and details of the Chinese technical writing certification exam, and the author cross-validates the outline through expert review. Next, the author discusses the applicability of each question type to the content of technical writing and proposes a design method for each. Afterwards, the author discusses the evaluation criteria of this study, based on the characteristics of technical writing and existing scoring criteria. Finally, building on this work, the author carries out three experiments. The subjects of the first are technical writers who have worked for more than two years; the subjects of the second are technical writers who have been in the field for less than half a year; and the subjects of the third are students majoring in Computer-Aided Translation at Peking University. The test results verify the reliability and validity of the sample test. The results indicate that the design of the technical writing certification is valid, reliable, and feasible, and that it meets the needs of the Chinese market. The author hopes that the design proposed here can inspire and encourage more enterprises and industries to pay attention to the development of the technical writing industry and to the cultivation of its talent.﹀
Classification number:	G40
Total pages:	84
Number of references:	57
Release date:	2018-11-30

医学英语词汇学习系统研究与设计.荣岩

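Title:	医学英语词汇学习系统研究与设计
Author:	荣岩
Student ID:	1501210657
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Availability:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	姚亚芝
Advisor 1 affiliation:	School of Software and Microelectronics
Advisor 2 name:	俞敬松
Advisor 2 affiliation:	School of Software and Microelectronics
Defense date:	2018-11-30
English title:	Research and Design of a Medical English Vocabulary Learning System
Keywords (Chinese):	医学英语词汇；词汇学习效率；自适应推荐；词汇记忆网；词汇复现
Keywords (English):	medical English vocabulary; vocabulary learning efficiency; adaptive recommendation; vocabulary memory networks; vocabulary repetition rate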
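Abstract (Chinese version):	︿Medical English vocabulary differs from general English vocabulary: it has distinctive word-formation patterns, clearly identifiable morphemes, and stronger connections between words. A survey shows that current learning resources for medical English vocabulary do not fully meet learners' needs. Teaching resources are mostly paper-based and therefore limited; learning efficiency is low, learning content is restricted, learners lack motivation, and there is no effective teaching system for medical English vocabulary. With respect to learning efficiency, three problems remain unsolved. First, current teaching ignores individual differences among learners, and existing teaching methods cannot dynamically recommend words to learn and review according to each learner's progress. Second, existing English vocabulary learning software does not fully exploit the features of medical English vocabulary or capture the key points of teaching it, and its teaching process is not fully suited to medical English vocabulary. Third, medical English words recur infrequently, so learners retain them poorly. To address these problems, this study draws on medical English vocabulary teaching theory and second language acquisition theory, takes advantage of the mobile internet, and designs a medical English vocabulary learning system that improves learning efficiency in three ways. First, it builds an adaptive recommendation model that recommends medical English vocabulary dynamically by combining the feature factors that influence each word. Second, based on the characteristics of medical English vocabulary, it selects teaching content modules, highlights the key points of teaching, optimizes the teaching process, and builds a medical English vocabulary memory network. Third, it presents words repeatedly along several dimensions, using a variety of repetition methods to raise the repetition rate. To verify the teaching effect of the system, the study ran a controlled teaching experiment with 50 second-year non-English majors at a school in Beijing. The experiment shows that the learning approach designed in this study effectively promotes the acquisition and retention of medical English words and morphemes and improves learners' ability to guess word meanings. The system effectively improves the efficiency of learning medical English vocabulary, eases the limitations of paper-based medical resources, makes up for the restricted content of classroom instruction, meets learners' individual needs, and uses varied learning methods to foster learners' interest and motivation. It offers a useful reference for mobile teaching of medical English vocabulary.﹀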
Abstract (English version):	︿Medical English vocabulary differs from general English vocabulary: it has specific word-formation patterns, its morphemes are distinct, and the semantic association among its words is stronger. By surveying medical English learners, the author finds that current medical English vocabulary learning resources cannot fully meet their needs. The teaching resources are mainly paper materials, which have many limitations. At present, medical English vocabulary learning suffers from low learning efficiency, limited learning materials, and low learner initiative, and an effective medical English vocabulary learning system is lacking. With regard to low learning efficiency, three problems have not been effectively solved. First, current medical English vocabulary teaching ignores individual differences among learners, and existing teaching methods cannot dynamically recommend medical vocabulary according to each learner's situation. Second, existing English vocabulary learning software fails to make full use of the features of medical English vocabulary and to grasp the key points of teaching it, and its teaching processes are not fully applicable to medical English vocabulary. Third, the repetition rate of medical English vocabulary is low, which leads to poor retention. To solve these problems, this study draws on medical vocabulary teaching theory and second language acquisition theory, takes advantage of the mobile internet, designs a medical English vocabulary learning system, and proposes three ways to improve learning efficiency. First, this study establishes an adaptive recommendation model for medical English vocabulary and realizes dynamic recommendation by comprehensively calculating the factors that influence each word. Second, based on the distinct features of medical English vocabulary, this study designs appropriate teaching modules, highlights the teaching focus, optimizes the teaching processes, and builds medical English vocabulary memory networks. Third, this study designs various ways to increase the repetition rate of medical English vocabulary. To verify the teaching effect of the system, this study conducted a comparative experiment with 50 non-English-major sophomores at a university in Beijing. The results show that the learning methods effectively promote the acquisition and retention of medical English vocabulary and morphemes and improve learners' ability to guess word meanings. The system effectively improves medical English vocabulary learning efficiency, alleviates the limitations of paper-based medical resources, supplements the limited teaching content of traditional medical English vocabulary lessons, meets learners' personalized needs, and cultivates learners' interest and initiative through diversified learning methods. The system offers some reference value for mobile teaching of medical English vocabulary.﹀
Classification number:	H08
Total pages:	82
Number of references:	66
Release date:	2018-11-30

基于多模态理论和图式理论的雅思听说学习系统的研究与设计.周璇

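Title:	基于多模态理论和图式理论的雅思听说学习系统的研究与设计
Author:	周璇
Student ID:	1501210821
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Availability:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	张宏岩
Advisor 1 affiliation:	School of Software and Microelectronics
Advisor 2 name:	高志军
Advisor 2 affiliation:	School of Software and Microelectronics
Defense date:	2018-11-30
English title:	Research and Design of IELTS Listening and IELTS Speaking Preparation Application Based on Schema Theory and Multimodality Theory
Keywords (Chinese):	雅思听力；雅思口语；图式理论；多模态理论
Keywords (English):	IELTS Listening; IELTS Speaking; Schema Theory; Multimodality Theory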
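Abstract (Chinese version):	︿IELTS is an English proficiency test for people who plan to study, work, or settle in countries where English is the language of communication. It covers listening, speaking, reading, and writing. This thesis focuses on the listening and speaking sections of the Academic test. As more and more students choose to study abroad, IELTS keeps growing in popularity, and a wide range of IELTS preparation software has appeared to help students prepare. Most of this software, however, serves only as a question bank: it concentrates on practice questions and neglects the development of students' listening comprehension and spoken expression. Students also run into many difficulties when preparing for the listening and speaking tests; they often work through many sets of past papers without improving their scores. The reason is that they simply drill questions and check answers, and the difficulties they meet go unresolved. Across the whole practice process, the main difficulties are as follows. First, before practice, students do not receive enough comprehensible input. Second, during practice, they do not have an accurate command of the answering techniques for IELTS questions. Third, after practice, they do not get enough feedback; they do not summarize and analyze their errors promptly or arrange targeted practice; and they receive no further skill training aimed at the IELTS listening and speaking tests. Starting from these difficulties and from the characteristics of the IELTS listening and speaking tests, this work reviews current teaching systems and, on the basis of multimodality theory and schema theory, combined with relevant teaching practice and the characteristics of mobile learning, designs an IELTS listening and speaking learning system. In the listening system, learners study and are tested on listening vocabulary and synonymous paraphrases before practice; after practice, the transcript is presented multimodally, and dictation exercises and transcript study are provided. In the speaking system, learners study and are tested on speaking vocabulary and study complex sentence patterns before practice; during practice they study answering techniques; after practice, model answers are presented multimodally according to an answer framework, and study of the model answers is provided. On this basis, the thesis compares the learning scheme designed for this system with the schemes of earlier learning systems. Through experiments, questionnaires, and data analysis, it evaluates the proposed scheme in terms of cognitive load and achievement of learning goals, and shows that, at a similar cognitive load, the scheme helps students achieve their learning goals more effectively. Before practice, the system helps students obtain enough comprehensible input; during practice, it builds and reinforces schemata for answering techniques; after practice, it resolves errors and problems and gives students sufficient feedback and skill training. This deepens students' internalization of knowledge, helps them form a virtuous cycle of practice, and makes full use of every set of past papers, so that students raise their scores step by step through practice while genuinely improving their listening comprehension and spoken expression.﹀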
Abstract (English version):	︿The International English Language Testing System (IELTS) is the world's most popular English language proficiency test for higher education and global migration. It assesses all four English skills: reading, writing, listening, and speaking. This thesis focuses on the listening and speaking parts of IELTS Academic. As more and more students choose to study abroad, the IELTS test is becoming increasingly popular. As a result, a variety of IELTS preparation systems have been developed to help students prepare for the test. However, most existing systems work only as a question repository focused on practice exercises and ignore the cultivation of students' listening and speaking ability. Students also encounter many difficulties when preparing for the exam: they often do many exercises, yet their results fail to improve. The reason is that they simply do exercises and check answers, and the difficulties they encounter are not resolved. Over the course of doing exercises, students face the following difficulties. First, before doing exercises, they do not obtain enough comprehensible input. Second, while doing exercises, they have not accurately grasped the answering techniques for IELTS. Third, after exercises, they lack sufficient feedback, error analysis, and corresponding exercises, and they receive no further skills training for the IELTS listening and speaking tests. Based on Chinese students' problems in preparing for the IELTS listening and speaking tests, an analysis of existing systems, and the characteristics of the two tests, this system is designed on the basis of Multimodality Theory and Schema Theory, combined with relevant teaching practices and the characteristics of mobile learning. In the IELTS listening learning system, listening vocabulary and synonyms are studied and tested before exercises. After exercises, the transcript is displayed in a multimodal way, and dictation exercises and transcript study are provided. In the IELTS speaking learning system, the learning and testing of spoken vocabulary and the learning of complex sentence patterns are arranged before exercises. During exercises, the learning of answering techniques is provided. After exercises, the model essay is displayed multimodally in accordance with the answer framework, and study of these model essays is provided. Based on the above design, this thesis compares the learning scheme of this system with those of existing systems. Through experiments, questionnaires, and data analysis, the learning scheme proposed by the system was verified in terms of cognitive load and learning goal achievement. The results show that the cognitive load of the learning scheme designed in this thesis is similar to that of existing systems, but the scheme is more conducive to students' achievement of learning goals. This system helps students obtain enough comprehensible input before doing exercises, create and strengthen answering-technique schemata while doing exercises, and, after exercises, resolve existing errors and problems and obtain sufficient feedback and further skill training. As a result, the system could enhance students' internalization of knowledge, help students form a virtuous cycle of doing exercises, and thus gradually improve their English listening and speaking ability while improving their scores on the IELTS test.﹀
Classification number:	G43
Total pages:	111
Number of references:	85
Release date:	2018-11-30

基于模拟方法的技术写作同源开发教学研究.杨爱萍

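Title:	基于模拟方法的技术写作同源开发教学研究
Name:	杨爱萍
Student ID:	1501210755
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Release time:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	李博婷
Advisor 1 affiliation:	School of Software and Microelectronics
Advisor 2 name:	高志军
Advisor 2 affiliation:	School of Software and Microelectronics
Defense date:	2018-11-30
Foreign-language title:	Research on Single Sourcing Teaching Based on Simulation-based Method
Keywords:	technical writing; single sourcing teaching; simulation-based method; instructional design
Foreign-language keywords:	Technical writing; Single sourcing teaching; Simulation-based teaching method; Instructional design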
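Abstract:	Demand for technical documentation keeps growing, and single sourcing has emerged in response. As a document development methodology, single sourcing emphasizes systematic reuse of documents through content modularization. Single sourcing is an important part of Peking University's technical communication curriculum; the teaching goal centers on mastering its basic principles, approach, and workflow, helping students break down linear documents and produce modular ones. Teaching experience from recent years shows three prominent problems in how students learn single sourcing: 1) topics are hard to identify, and identification accuracy is low; 2) the technical principles involved in writing are not well mastered; 3) topics are organized incoherently, and the document architecture is disordered. These problems are closely tied to the learning content, teaching methods, and teaching tools of single sourcing. Through surveys and interviews, this thesis investigates the root causes of these teaching problems, summarizes learner and industry needs, reviews existing teaching tools, and designs SuperEasyDITA, a CCMS teaching simulator that fills the gap in single sourcing teaching tools. Based on this tool, it designs targeted teaching methods: 1) a two-way teaching mode of reverse topic disassembly and matching plus forward analysis and identification, to address the difficulty of topic identification; 2) writing modes of gradually increasing difficulty, technical reminders during writing, and switching and revising of the finished document, to improve mastery of technical principles; 3) student-constructed scenarios and peer discussion of architecture, to improve the approach to document organization; 4) guided feedback and discussion, to deepen overall understanding of the principles, approach, and workflow of single sourcing and reinforce the key points of each stage. To verify the effectiveness of the teaching methods, a teaching experiment was run with students who took the technical communication elective courses at Peking University in the 2017 and 2018 cohorts; the experimental group used the simulator-based teaching method and the control group used traditional teaching. The results show that SuperEasyDITA meets the basic functional requirements of single sourcing and has significant advantages over traditional teaching tools in teaching usefulness, ease of use, and innovation. The simulator-based teaching method helps students master XML-related technical principles and improves their command of those principles during writing; helps students identify topic types and improves identification accuracy; and helps students organize topics and improves document structure. Overall, it raises teaching efficiency while resolving the teaching problems, and effectively helps students master the body of knowledge of single sourcing.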
Foreign-language abstract:	The demand for technical documents is growing, and the document development methodology called single sourcing has emerged in response. Single sourcing emphasizes the systematic reuse of documents through content modularization. Single sourcing teaching is an important part of Peking University's technical communication curriculum. The teaching objectives focus on its basic principles, ideas, and processes, and help students disassemble linear documents and generate modular documents. After summarizing and analyzing the teaching experience of recent years, this paper finds that students still have three problems in learning single sourcing: 1) difficulty in identifying topic types, with low recognition accuracy; 2) difficulty in understanding XML-related technical knowledge; 3) difficulty in organizing the document structure, which ends up unclear. These problems are closely linked to the single sourcing learning content, teaching methods, and teaching tools. This paper explores the crux of these problems through surveys and interviews, summarizes learner and industry needs, analyzes existing teaching tools, and designs a CCMS teaching simulator called SuperEasyDITA to make up for the lack of single sourcing teaching tools. Based on SuperEasyDITA, this paper designs a teaching method aimed at solving the current problems. Specifically, it adopts: 1) topic disassembly and matching from a standard modular document, and topic analysis and recognition from a linear document, which addresses the problem of topic recognition; 2) writing modes of gradually increasing difficulty, instant technical reminders during writing, and observation, switching, and modification of the final document, which improve the mastery of technical knowledge; 3) construction of document use situations and collaborative discussion, which strengthen document organizing ideas and methods; 4) instructive feedback and discussion, which deepen the understanding of overall ideas and processes. To verify the effectiveness of this method, the study carries out teaching experiments with students of the 2017 and 2018 technical communication courses at Peking University. The experimental group adopts the simulator-based teaching method and the control group adopts the traditional teaching method. The results show that the teaching simulator SuperEasyDITA satisfies the basic functional requirements of single sourcing and has significant advantages in teaching usefulness, ease of use, and innovation compared with traditional teaching tools. The simulator-based teaching method helps students improve topic recognition accuracy; helps students understand XML and related technical knowledge in the writing process; helps students organize topics and improve document organization; and improves teaching efficiency while optimizing the teaching process, solving the teaching problems, and effectively helping students better master the knowledge of single sourcing.
Classification number:	H08
Total pages:	95
Number of references:	69
Release date:	2018-11-30

2018-06-06

指称理论对于生成语法的必要性.张振宝

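Title:	指称理论对于生成语法的必要性
Name:	张振宝
Student ID:	1401213083
Thesis language:	eng
Major:	Literature - Foreign Languages and Literatures - Foreign Linguistics and Applied Linguistics
Release time:	After 1 year
Degree level:	Master's
Degree:	Master of Arts
Institution:	Peking University
School:	School of Foreign Languages
Advisor 1 name:	何卫
Advisor 1 affiliation:	School of Foreign Languages
Defense date:	2018-06-06
Foreign-language title:	On the Position of Reference Theory in Generative Grammar
Keywords:	Generative Grammar; reference; notion of satisfaction; necessity; principle of Full Interpretation
Foreign-language keywords:	Generative Grammar; reference; notion of satisfaction; necessity; FI principle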
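Abstract:	Saussure held that language is a structured system, with the signifier and the signified as two complementary sides of the linguistic sign. Building on his distinction between sense and reference, Frege held that a linguistic sign expresses a sense and refers to an individual person or thing. In dealing with the semantics of language, Chomsky has repeatedly denied that reference is part of the human language system. A review of the development of generative linguistics, however, shows that reference plays a very important role in that theory. This thesis first discusses the theory of reference, surveying the views of three representative philosophers, Frege, Russell, and Strawson, and, drawing on Tarski's notion of satisfaction, attempts to connect reference theory with the syntactic computation of Generative Grammar. Taking this theoretical link as its starting point, it then examines the necessity of reference for Generative Grammar. In the early category-based rule system, that is, phrase structure grammar, a sentence is rewritten as a structure made up of syntactic categories, and concrete words are then selected within each category to form language as actually used. This way of generating language does not involve reference. In the later Principles and Parameters theory, if reference is not taken into account, DP and IP cannot satisfy the Extended Projection Principle, and the syntactic computation fails. In the Minimalist Program, the principle of Full Interpretation requires that linguistic units be fully interpreted at every step of the syntactic computation, that is, that the result each unit yields at each step be interpreted as a pairing of meaning and sound. If reference is not taken into account in the Minimalist Program, DP and IP, including tense and modal verbs, cannot be fully interpreted, and the syntactic computation will crash. On this basis, the thesis concludes that reference is necessary for generative linguistics.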
Foreign-language abstract:	Saussure considers language a structured system, with signified and signifier as its two complementary facets. Frege's theory of reference, based on the distinction between sense and referent, claims that a language sign expresses its sense and denotes its referent. Chomsky, in his treatment of semantic problems, repeatedly rejects reference as part of the human language system. However, a brief survey of its historical development reveals that the reference relation cannot be neglected and instead plays a very important role in language computation. This thesis investigates the necessity of reference in Generative Grammar. By surveying the reference theories of Frege, Russell, and Strawson, this thesis finds that DP can be defined by more primitive elements, the variables that are undermined, and that only by assigning a truth value to the variable does a DP denote a person or an object in the world. Tarski's notion of satisfaction defines truth through syntax, which, when connected with the definition of DP, can be used to test whether the reference relation is necessary for Generative Grammar. In the early Category-based Rule System, reference is not involved in language computation. According to this system, a sentence is rewritten as a syntactic structure composed of syntactic categories. A word is then picked from each category to produce a terminal sentence. In the Principle and Parameter Model, syntactic levels like DP and IP cannot satisfy their respective sentential functions without considering reference, which violates the Extended Projection Principle; the language computation therefore cannot move on, because the projection approach is the basic way of language computation in this model. In the Minimalist Program of the Principle and Parameter Model, DP and IP cannot receive their full interpretation without considering reference. Therefore, the syntactic computation will crash, because the FI Principle is a general property of natural language. Based on these arguments, the thesis concludes that reference is necessary for Generative Grammar.
Classification number:	H04
Total pages:	64
Number of references:	49
Release date:	2019-06-06

2018-05-27

英汉翻译中的变通与忠实.张英杰

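Title:	英汉翻译中的变通与忠实
Name:	张英杰
Student ID:	1601213263
Thesis language:	chi
Major:	Professional degree - Master of Translation and Interpreting - English Translation (written)
Release time:	Public
Degree level:	Master's
Degree:	Master of Translation and Interpreting (professional degree)
Institution:	Peking University
School:	School of Foreign Languages
Advisor 1 name:	朱源
Advisor 1 affiliation:	School of Foreign Languages, Renmin University of China
Defense date:	2018-05-27
Foreign-language title:	Flexibility and Fidelity in E-C Translation
Keywords:	English-Chinese translation; flexibility; fidelity
Foreign-language keywords:	E-C translation; flexibility; fidelity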
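Abstract:	In translation practice, the translator found that, apart from understanding the source text, the main work of translation lies in overcoming differences between languages: the same meaning is said one way in English and must be said another way in Chinese to be understood. Differences between languages will always exist, so how to change the "way of saying" in order to convey meaning is a permanent question for translation. Of course, translation is not always about change. The thesis of this translation report is that flexibility and fidelity are the two major principles of translation. First, as a conversion from one language to another, translation is on the whole a process of linguistic domestication, and it necessarily involves linguistic flexibility in order to reconcile differences between the two languages in grammar, idiom, and so on, and to achieve the main purpose of translation: conveying meaning. Beyond linguistic flexibility, translation naturally has things that should not change, namely the respects in which it should be faithful to the original; this report defines "fidelity" as fidelity in three respects: meaning, language style, and terminology. To illustrate this thesis, the report uses examples drawn from the linguistic features of the book 《消费时代的迷思》 to discuss a range of flexibility strategies, such as translating abstract nouns, handling parentheticals, handling long clauses, and adding logical connectives. It also shows by example how the translator achieves fidelity in translating terminology, and specific ways of faithfully conveying the style of the original.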
Classification number:	H059
Total pages:	276
Number of references:	14
Call number:	039/M2018(103)
Release date:	2018-05-27

2018-05-26

基于深度学习的文本语句扩展系统的设计与实现.于昌和

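Title:	基于深度学习的文本语句扩展系统的设计与实现 (Design and Implementation of a Deep-Learning-Based Text Sentence Expansion System)
Author:	于昌和
Student ID:	1501210770
Language:	chi
Major:	Professional degree - Engineering - Computer Technology
Release time:	After 3 years
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor name:	俞敬松
Advisor affiliation:	School of Software and Microelectronics
Defense date:	2018-05-26
Keywords (Chinese):	deep learning; text expansion; Seq2Seq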
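Abstract:	English is one of the most widely used languages in the world. In China, more and more people are learning English, and English writing has become one of the most important parts of learning it. English sentence expansion derives from the "write-long" method used in teaching English writing: students learn various techniques and use various methods to keep making their sentences longer, which trains their ability to learn and produce English. In actual language services, writing is an important step, and sentences need to be not only longer but also richer and more fluent. Starting from making sentences longer, this thesis explores the use of deep learning techniques to help writers improve their writing techniques and strategies. seq2seq has achieved good results in machine translation; this thesis uses different encoders and decoders within seq2seq to expand text sentences in three ways: 1. Vocabulary expansion. Some people writing in English produce dry, thin sentences and do not use adjectives and adverbs. To make sentences richer and more fluent, the thesis designs an adjective and adverb expansion module that supplements and expands the adjectives and adverbs in a sentence. 2. Sentence continuation. Some learners, when writing in English, finish one sentence and do not know what to write next; for this case the thesis designs a sentence continuation module. 3. Sentence generation. The thesis explores generating sentences from nouns and verbs alone, because some writers can think of only a few verbs and nouns and cannot organize them into a complete, fluent sentence; for this case the thesis designs a sentence generation module. The vocabulary expansion module helps writers supply adjectives and adverbs well: on the training set it fully recovers 63% of the adjectives and adverbs. On 900 test sentences, the sentences expanded by the model score only 0.14 lower under a language model than the target sentences. In the sentence continuation task, the model can give writers ideas to some extent. In the task of automatically generating sentences from verbs and nouns, some progress was made: the BLEU score reached 27.23 and word coverage reached 63%.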
Classification number:	TP3
Total pages:	60
Number of references:	38
Call number:	017/M2018(311)
Release date:	2021-05-26

基于多人在线战术竞技游戏的虚拟团队数据分析与研究.曾伊蕾

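Title:	基于多人在线战术竞技游戏的虚拟团队数据分析与研究
Name:	曾伊蕾
Student ID:	1401210506
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public release:	after 1 year
Level of study:	Master's
Degree:	Master of Engineering (professional degree)
Degree-granting institution:	Peking University
School:	School of Software and Microelectronics
Supervisor 1 name:	俞敬松
Supervisor 1 affiliation:	School of Software and Microelectronics
Thesis defense date:	2018-05-26
Keywords:	computational social science; individual performance research; network science of virtual teams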
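Abstract:	Computational social science (CSS) is an interdisciplinary field combining computer technology and the social sciences. This study is a concrete instance of that field's research on individual and group behavior, and is supported by funding in the field. Its aim is to quantify, track, and predict the common and anomalous behavioral trajectories of individual and team performance in gamified virtual team environments, in order to help virtual and real-world teams improve their overall performance and to support personalized incentives on online gamified platforms. The novelty and significance of this work lie in its targeted solutions to four classes of problems in research on individual performance. First, individual behavior is usually modeled in a one-size-fits-all manner; this thesis models individuals from multiple personalized perspectives, including role, experience, skill, and team network structure. Second, individual behavior models usually contain no temporal dynamics; all models in this thesis account for how human behavioral trajectories evolve over time. Third, individual behavior models usually ignore social network effects; Chapter 5 of this thesis focuses on the impact of different network structures. Fourth, individual behavior models usually lack generality, reproducibility, testability, and interpretability; the methods in this thesis are all interpretable and reproducible, and the experimental results show that the conclusions generalize across game platforms. The thesis first builds a model of how individual performance evolves over time, analyzing player data from the multiplayer online battle arena (MOBA) game League of Legends. Through regression analysis of long-term behavior and game-session analysis of short-term behavior, the data reveal conclusions that run contrary to common intuition: individual performance deteriorates within short-term game sessions, and improvement in individual performance is not directly related to long-term experience, although experience can mitigate short-term performance deterioration. Using machine learning algorithms, the thesis builds a nested model that accurately predicts when a player chooses to continue or end the current game session, revealing the key factors behind that decision. Building on this temporal model, the thesis then constructs a model of how individual performance evolves with role selection, using player data from the MOBA game Dota 2. Roles are defined through statistical analysis, and the results show a warm-up phenomenon in individuals' short-term behavior across roles. The model classifies individuals by experience, skill, and role, and the experimental results reveal patterns of success for individual players. Finally, again building on the temporal model, the thesis models the impact of network structure on individual and team performance, using not only massive MOBA game data but also data on players' real friendship relations. It subdivides team network structures and applies network science, economic principles, and mathematical statistics to analyze individual and team performance as it evolves over time. The results show that in low-ability teams, the players making up the network structure generate positive externalities, which improve the behavior and performance of both the individuals within the team and the team as a whole. High-level teams need to deliberately pair low-level individuals with high-level individuals, internalizing negative externalities to help improve team and individual performance. The results also show that close ties within a team help mitigate the short-term performance deterioration effect. Although this is a domain-specific study, the theoretical results obtained, the dynamic models built, and the analytical methods used can all be applied to more abstract contexts for describing and explaining human behavior.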
Classification number:	TP3
Total pages:	127
Total number of references:	72
Call number:	017/M2018(336)
Release date:	2019-05-26

基于神经网络的影视剧向量表示模型.隋春宁

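Title:	基于神经网络的影视剧向量表示模型
Author:	隋春宁
Student ID:	1501210674
Language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public release:	after 3 years
Level of study:	Master's
Degree:	Master of Engineering (professional degree)
Degree-granting institution:	Peking University
School:	School of Software and Microelectronics
Supervisor:	俞敬松
Supervisor affiliation:	School of Software and Microelectronics
Defense date:	2018-05-26
Title (foreign language):	A Video Content Embedding Model Using Neural Networks
Keywords (Chinese):	分布式表示; 神经网络; 视频内容表示
Keywords (foreign language):	Distributed representation; Neural networks; Video content embedding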
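Abstract:	As video websites continue to grow, both the volume of film and television data and the number of users have risen sharply, creating strong demand for tasks such as the automatic classification and recommendation of films and TV series. Traditionally, classification information on video sites has come from human editors, while recommender systems rely mainly on user behavior data and collaborative filtering algorithms. Because labeling manpower is limited and data are sparse, the scalability of manual classification is a major bottleneck, and recommendations for niche titles or new users are also limited. This thesis uses neural networks to reduce the dimensionality of, and integrate, heterogeneous text data from different sources, such as tags and plot synopses, mapping the raw text into a semantic space to obtain content-based low-dimensional vector representations. Such distributed vector representation models are called embedding models in deep learning; in recent years they have attracted wide attention and study in natural language processing and have achieved breakthroughs on many tasks. The thesis first studies how to model text data at different granularities, surveys the concepts and methods of distributed semantic representation models at the word, phrase, sentence, and paragraph levels, and discusses how to apply them to film and television. Second, based on neural networks, it builds a vector representation model of film and TV content that, through an improved negative sampling training strategy, fuses text metadata of different granularities and sources into vector representations in a consistent semantic space. The study shows that distributed vector representation models based on neural networks can effectively model the content of existing films and TV series and can be applied to newly added titles. The model can be applied to tasks such as automatic recommendation and clustering.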
Abstract (foreign language):	With the continuous growth of online video providers, the number of movies and television series online has risen significantly, along with the amount of user data. This has created great demand for tasks such as the automatic classification and recommendation of video content. Traditionally, the classification information on video sites often comes from human editors, while recommendation systems mainly rely on user behavioral data and collaborative filtering algorithms. Due to the limited man-hours available for labeling and the problem of data sparseness, the scalability of manual classification is a major bottleneck; the recommendations for unpopular movies or new users are also limited. In this dissertation, neural networks are employed for the dimensionality reduction and integration of heterogeneous text data from different sources, such as the labels and synopses of movies and television series. By mapping raw texts to a semantic space, we obtain low-dimensional vector representations based on their content. Such a distributed vector representation model is called an embedding model in deep learning. In recent years, it has received extensive attention and research in the field of natural language processing, and has led to breakthroughs in many tasks. First, this dissertation studies how to model text data at different granularities, reviews the concepts and methods of distributed semantic representation models of words, phrases, sentences and paragraphs, and discusses how to apply these models in the context of movies and television series. Second, a vector representation model of video content is established using neural networks. Text metadata of different granularities and from different sources are merged and mapped into a vector representation in a consistent semantic space via an improved negative sampling training strategy. The study shows that the distributed vector representation model based on neural networks can effectively model the content of existing movies and television series, and can be easily applied to newly added content. The model can be applied to automatic recommendation, clustering and other tasks.
Classification number:	TP3
Total pages:	61
Number of references:	60
Call number:	017/M2018(368)
Release date:	2021-05-26

面向移动端的用户检索实体抽取系统设计与实现.曹圣明

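Title:	面向移动端的用户检索实体抽取系统设计与实现
Author:	曹圣明
Student ID:	1501210487
Language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public release:	after 3 years
Level of study:	Master's
Degree:	Master of Engineering (professional degree)
Degree-granting institution:	Peking University
School:	School of Software and Microelectronics
Supervisor:	俞敬松
Supervisor affiliation:	School of Software and Microelectronics
Defense date:	2018-05-26
Title (foreign language):	Design and Implementation of Entity Extraction System in User Query For Mobile Terminal Devices
Keywords (Chinese):	深度学习; 实体抽取; 智能语义交互; 注意力机制; 词向量; 硬件适配; 模型压缩
Keywords (foreign language):	deep learning; entity extraction; intelligent semantic interaction; attention mechanism; word embedding; hardware adaptation; model compression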
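Abstract:	Entity extraction, a basic task of natural language processing, has made a series of breakthroughs with the rise of deep learning. As a foundational component of tasks such as question answering, human-machine dialogue, and machine translation, its role is irreplaceable. Recently, with the rise of artificial intelligence and growing demand for intelligent semantic interaction, entity extraction from user queries has become an important capability. Compared with traditional named entity recognition, it must cover a broader range of domains, meet stricter precision and accuracy requirements, and handle more complex user interaction logic. With the entity recognition results, a system can carry out a series of resource requests and service dispatches, fulfilling users' needs and guiding their latent needs, which is a very important part of new forms of text interaction. With this goal, this thesis implements two systems, one online and one offline, whose core is the entity extraction function, supplemented by the necessary pattern matching module to serve users' popular queries and correct the model's recognition defects. For entity extraction, models are trained, tuned, and deployed mainly with the TensorFlow framework. For the baseline, the thesis innovatively adopts a seq2seq structure to implement the basic named entity recognition framework; it then tunes the baseline model with respect to training data size, input granularity, normalization, and the attention mechanism; finally, it improves and optimizes the model structure in three respects: the word embedding generation method, the attention mechanism, and new model types. Overall, model performance improves by more than 10 points. During algorithm iteration, the best results were obtained by ensembling models and enhancing the word embeddings. Finally, the model was tested on Microsoft's public named entity recognition test set and achieved fairly good results. The practice of CNN encoders, an in-depth study of attention mechanisms, and a survey of entity disambiguation models are left as future research directions. For model deployment on mobile devices, the thesis also carries out in-depth optimization on both the hardware and software sides. On the software side, model compression and data structure optimization were performed; on the hardware side, dependency separation and hardware adaptation. Overall, the work resolves reasonably well the high memory usage and low execution efficiency that deep learning models face when deployed on mobile devices, and many of the solutions are worth drawing on.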
Abstract (foreign language):	As a basic task of natural language processing, entity extraction has achieved breakthroughs with the rise of deep learning. Named entity extraction plays an irreplaceable role in question answering systems, interactive chat, machine translation, and similar tasks. Recently, with the growing demand for intelligent semantic interaction and the rise of AI, entity extraction has become a focal point in user query processing. Compared to traditional named entity recognition, it covers a broader range of fields, imposes stricter requirements on precision and recall, and involves more sophisticated interaction routines. Based on the extraction results, we can complete a series of resource requests and service dispatches, not only to meet users' demands but also to stimulate their potential needs. We have therefore implemented two systems, an online version and an offline version, each composed of an entity extraction part and a related pattern matching module. The latter module is necessary to handle popular queries and compensate for the model's shortcomings. In this paper, we mainly use TensorFlow for model training, fine-tuning and deployment. As the baseline of our experiments, we used a seq2seq architecture instead of the encoder-only method and achieved better results. We then tuned our baseline in terms of data scale, input granularity, regularization and the attention mechanism. Finally, we changed the model architecture through embedding adjustment, the attention mechanism and novel methods to yield better performance. In total, the new results outperform the older ones by more than 10 percentage points. We tested our model on the open MSRA NER dataset and achieved state-of-the-art performance. Furthermore, we will continue our research on CNN encoders, attention mechanisms and entity disambiguation. As for deployment on mobile devices, we have made optimizations and improvements in terms of both software and hardware. More specifically, the former mainly consists of model compression and data structure optimization. For the latter, we mainly used dependency release and instruction-level adaptation. Overall, we have solved the problems of high memory usage and low inference speed that may be encountered when deploying deep learning models on mobile devices.
Classification number:	TP3
Total pages:	124
Number of references:	72
References:	︿ [1] Maha Althobaiti, Udo Kruschwitz and Massimo Poesio. Combining Minimally-supervised Methods for Arabic Named Entity Recognition. 2015,3. [2] Jimmy Lei Ba, Jamie Ryan Kiros and Geoffrey E Hinton. Layer Normalization. 2016. [3] Dzmitry Bahdanau, Kyunghyun Cho and Yoshua Bengio. Neural Machine Translation by Jointly Learning to Align and Translate. CoRR, 2014, abs/1409.0473. http://arxiv.org/abs/1409.0473. [4] Yoshua Bengio, Réjean Ducharme, Pascal Vincent et al. A Neural Probabilistic Language Model. Journal of Machine Learning Research, 2004-02-05: 1137–1155. http://dblp.uni-trier.de/db/journals/jmlr/jmlr3.html#BengioDVJ03. [5] Daniel M. Bikel, Richard Schwartz and Ralph M. Weischedel. An Algorithm that Learns What’s in a Name. Machine Learning, 1999, 34(1-3): 211–231. [6] Andrew Borthwick, John Sterling, Eugene Agichtein et al. Description of the MENE named entity system as used in MUC-7. 1998. [7] Andrew Borthwick, John Sterling, Eugene Agichtein et al. Exploiting Diverse Knowledge Sources via Maximum Entropy in Named Entity Recognition. In: 1998: 152–160. [8] Randall L Calvert. Robustness of the Multidimensional Voting Model: Candidate Motivations, Uncertainty, and Convergence. American Journal of Political Science, 1985, 29(1): 69. [9] Aitao Chen, Fuchun Peng, Roy Shan et al. Chinese named entity recognition with conditional probabilistic models. 2006. [10] Jason P. C. Chiu and Eric Nichols. Named Entity Recognition with Bidirectional LSTM-CNNs. Computer Science, 2015. [11] Key Sun Choi, Key Sun Choi and Key Sun Choi. Unsupervised named entity classification models and their ensembles. In: International Conference on Computational Linguistics, 2002: 1–7. [12] Michael Collins. Unsupervised Models for Named Entity Classification. In: Joint Sigdat Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, 1999: 100–110. [13] Ronan Collobert, Jason Weston, Michael Karlen et al. 
Natural Language Processing (Almost) from Scratch. Journal of Machine Learning Research, 2011, 12(1): 2493–2537. [14] Tim Cooijmans, Nicolas Ballas, César Laurent et al. Recurrent Batch Normalization. CoRR, 2016, abs/1603.09025. http://arxiv.org/abs/1603.09025. [15] Chuanhai Dong, Jiajun Zhang, Chengqing Zong et al. Character-Based LSTM-CRF with Radical-Level Features for Chinese Named Entity Recognition. Springer International Publishing, 2016. [16] Radu Florian. Named entity recognition as a house of cards: classifier stacking. In: Conference on Natural Language Learning, 2002: 1–4. [17] Jonas Gehring, Michael Auli, David Grangier et al. Convolutional Sequence to Sequence Learning. 2017. [18] Franck Genet and Franck Genet. Tagging unknown proper names using decision trees. In: Meeting on Association for Computational Linguistics, 2000: 77–84. [19] Yoav Goldberg. The unreasonable effectiveness of Character-level Language Models. [20] Michael Gutmann and Aapo Hyvärinen. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models. Journal of Machine Learning Research, 2010, 9: 297–304. [21] Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka et al. A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks. 2016. [22] Kaiming He, Xiangyu Zhang, Shaoqing Ren et al. Deep Residual Learning for Image Recognition. CoRR, 2015, abs/1512.03385. http://arxiv.org/abs/1512.03385. [23] Geoffrey E. Hinton, Alex Krizhevsky and Sida D. Wang. Transforming Auto-Encoders. 2011, 6791: 44–51. [24] Geoffrey E Hinton, Sara Sabour and Nicholas Frosst. Matrix capsules with EM routing. In: International Conference on Learning Representations, 2018. https://openreview.net/forum?id=HJWLfGWRb. [25] Sergey Ioffe and Christian Szegedy. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. 2015: 448–456. [26] Sergey Ioffe and Christian Szegedy. 
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. CoRR, 2015, abs/1502.03167. http://arxiv.org/abs/1502. 03167. [27] Armand Joulin, Edouard Grave, Piotr Bojanowski et al. Bag of Tricks for Efficient Text Classification. 2016: 427–431. [28] Yoon Kim, Yacine Jernite, David Sontag et al. Character-Aware Neural Language Models. Computer Science, 2015. [29] Trausti Kristjansson, Aron Culotta, Paul Viola et al. Interactive Information Extraction with Constrained Conditional Random Fields. In: Nineteenth National Conference on Artificial Intelligence, Sixteenth Conference on Innovative Applications of Artificial Intelligence, July 25-29, 2004, San Jose, California, Usa, 2004: 412–418. [30] Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian et al. Neural Architectures for Named Entity Recognition. CoRR, 2016, abs/1603.01360. http://arxiv.org/abs/1603.01360. [31] César Laurent, Gabriel Pereyra, Philémon Brakel et al. Batch Normalized Recurrent Neural Networks. 2015: 2657–2661. [32] Nicholas Leonard. Language modeling a billion words. [33] Dongyun Liang, Weiran Xu, Yinge Zhao et al. Combining Word-Level and Character-Level Representations for Relation Classification of Informal Text. In: The Workshop on Representation Learning for Nlp, 2017: 43–47. [34] Xuezhe Ma and Eduard Hovy. End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF. 2016. [35] Andrei Mikheev, Claire Grover and Marc Moens. Description Of The Ltg System Used For Muc-7. In: 1998. [36] Tomas Mikolov, Kai Chen, Greg Corrado et al. Efficient Estimation of Word Representations in Vector Space. Computer Science, 2013. [37] Andriy Mnih and Yee Whye Teh. A fast and simple algorithm for training neural probabilistic language models. 2012: 419–426. [38] Richard Morgan, Roberto Garigliano, Paul Callaghan et al. University of Durham: description of the LOLITA system as used in MUC-6. In: Conference on Message Understanding, 1995: 71–85. 
[39] A?ron van den Oord, Sander Dieleman, Heiga Zen et al. WaveNet: A Generative Model for Raw Audio. CoRR, 2016, abs/1609.03499. http://arxiv.org/abs/1609.03499. [40] Jeffrey Pennington, Richard Socher and Christopher Manning. Glove: Global Vectors for Word Representation. In: Conference on Empirical Methods in Natural Language Processing, 2014: 1532– 1543. [41] Tran Quan, Andrew Mackinlay and Antonio Jimeno Yepes. Named Entity Recognition with stack residual LSTM and trainable bias decoding. 2017. [42] Lisa F Rau. Extracting company names from text. In: Artificial Intelligence Applications, 1991. Proceedings., Seventh IEEE Conference on, 1991: 29–32. [43] Lisa F Rau. Method for extracting company names from text. US, 1994. [44] Nils Reimers and Iryna Gurevych. Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP). Copenhagen, Denmark, 2017-09: 338–348. http://aclweb.org/anthology/D17-1035. [45] Sara Sabour, Nicholas Frosst and Geoffrey E Hinton. Dynamic Routing Between Capsules. 2017. [46] Samuel L. Smith, Pieter-Jan Kindermans and Quoc V. Le. Don’t Decay the Learning Rate, Increase the Batch Size. CoRR, 2017, abs/1711.00489. http://arxiv.org/abs/1711.00489. [47] Rohini Srihari, Niu Cheng and Li Wei. A Hybrid Approach for Named Entity and Sub-Type Tagging. In: Applied Natural Language Processing Conference, 2000: 247–254. [48] Rupesh Kumar Srivastava, Klaus Greff and Jürgen Schmidhuber. Training very deep networks. Computer Science, 2015. [49] Emma Strubell, Patrick Verga, David Belanger et al. Fast and Accurate Sequence Labeling with Iterated Dilated Convolutions. CoRR, 2017, abs/1702.02098. http://arxiv.org/abs/1702. 02098. [50] Chen Sun, Abhinav Shrivastava, Saurabh Singh et al. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era. CoRR, 2017, abs/1707.02968. http://arxiv.org/abs/1707.02968. 
[51] Ashish Vaswani, Noam Shazeer, Niki Parmar et al. Attention Is All You Need. 2017. [52] A Waibel, T Hanazawa, G Hinton et al. Phoneme recognition using time-delay neural networks. IEEE Press, 1990: 328–339. [53] Haochang Wang, Tiejun Zhao and Jianmiao Liu. Multi-Agent Classifiers Fusion Strategy for Biomedical Named Entity Recognition, 2008: 311–315. [54] Dekai Wu, Grace Ngai and Marine Carpuat. A Stacked, Voted, Stacked Model for Named Entity Recognition. In: Conference on Natural Language Learning at Hlt-Naacl, 2003: 200–203. [55] Zichao Yang, Diyi Yang, Chris Dyer et al. Hierarchical Attention Networks for Document Classification. In: Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2017: 1480–1489. [56] Fisher Yu and Vladlen Koltun. Multi-Scale Context Aggregation by Dilated Convolutions. CoRR, 2015, abs/1511.07122. http://arxiv.org/abs/1511.07122. [57] Suxiang Zhang, Juan Wen and Xiaojie Wang. Word Segmentation and Named Entity Recognition for SIGHAN Bakeoff3. 2006. [58] Xiang Zhang and Yann LeCun. Text Understanding from Scratch. CoRR, 2015, abs/1502.01710. http://arxiv.org/abs/1502.01710. [59] Xiang Zhang, Junbo Zhao and Yann Lecun. Character-level Convolutional Networks for Text Classification. 2015: 649–657. [60] Yimin Zhang and Joe F. Zhou. A trainable method for extracting Chinese entity names and their relations. In: The Workshop on Chinese Language Processing: Held in Conjunction with the Meeting of the Association for Computational Linguistics, 2000: 66–72. [61] Junsheng Zhou, Liang He, Xinyu Dai et al. Chinese Named Entity Recognition with a Multi-Phase Model. 2012. [62] ZHOU, Junsheng, Weiguang et al. Chinese Named Entity Recognition via Joint Identification and Categorization. Chinese Journal of Electronics, 2013. [63] 冯元勇, 孙乐, 张大鲲 et al. 基于小规模尾字特征的中文命名实体识别研究. 电子学报, 2008, 36(9): 1833–1838. [64] 黄德根, 马玉霞 and 杨元生. 基于互信息的中文姓名识别方法. 大连理工大学学报, 2004, 44(5): 744–748. 
[65] 季姮 and 罗振声. 基于反比概率模型和规则的中文姓名自动辨识系统. In: 全国计算语言学联合学术会议, 2001. [66] 季姮 and 罗振声. 基于统计和规则的中文姓名自动辨识. 语言文字应用, 2001, (1): 14–18. [67] 孙茂松, 黄昌宁, 高海燕 et al. 中文姓名的自动辨识. 中文信息学报, 1995, 9(2): 16–27. [68] 孙茂松 and 邹嘉彦. 汉语自动分词研究评述. 当代语言学, 2001, 3(1): 22–32. [69] 向晓雯, 史晓东 and 曾华琳. 一个统计与规则相结合的中文命名实体识别系统. 计算机应用, 2005, 25(10): 2404–2406. [70] 张小衡 and 王玲玲. 中文机构名称的识别与分析. 中文信息学报, 1997, 11(4): 21–32. [71] 郑家恒, 李鑫 and 谭红叶. 基于语料库的中文姓名识别方法研究. 中文信息学报, 2000, 14(1): 7–12. [72] 周俊生, 戴新宇, 尹存燕 et al. 基于层叠条件随机场模型的中文机构名自动识别. 电子学报, 2006, 34(5): 804–809. ﹀
Call number:	017/M2018(372)
Release date:	2021-05-26

Design and Research of a Stroke-Based Chinese Character Embedding Model. Zhao Haoxin (赵浩新)

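Title:	Design and Research of a Stroke-Based Chinese Character Embedding Model (基于笔画的中文字向量模型设计与研究)
Name:	Zhao Haoxin (赵浩新)
Student ID:	1501211040
Thesis language:	chi
Major:	Professional degree - Engineering - Software Engineering
Access:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	Yu Jingsong (俞敬松)
Advisor 1 affiliation:	School of Software and Microelectronics
Defense date:	2018-05-26
Foreign-language title:	Design and research of Chinese Word Embedding Model Based on Strokes
Keywords:	stroke-based character embedding; CBOW; Skip-gram
Foreign-language keywords:	Stroke Word Embedding; CBOW; Skip-gram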
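Abstract:	Data representation is a fundamental problem in machine learning. The first step in any machine learning task is to digitize the input samples. Unlike digital signals such as sound, images, and video, natural language is inherently highly structured and abstract, so the first task in natural language processing is to digitize text. As technology has developed, the ways of representing text have steadily improved: from the original one-hot encoding to today's distributed representations, word vectors carry increasingly rich information. Existing statistical models, however, still cannot effectively represent out-of-vocabulary and low-frequency words. Research on Chinese word vectors is constrained by the "pictographic" nature peculiar to Chinese characters, and there is not yet a method that makes effective use of stroke information. Building on the CBOW framework of word2vec, this thesis proposes a stroke-based Chinese character embedding model that studies how strokes combine to form characters, in order to construct high-quality character vectors for out-of-vocabulary and low-frequency Chinese characters. The model uses the following methods: it relies on the context of the current character to vectorize strokes and learn the rules by which strokes combine into characters; it introduces an attention mechanism to enrich those rules; and it uses a CNN to capture information about character components and compound characters. In addition, drawing on the idea of generative adversarial networks, the thesis builds on the Skip-gram model of word2vec and attempts to add stroke information to character vectors in an adversarial manner. The evaluation compares the character vectors produced by the model with those produced by word2vec and GloVe in terms of precision and recall on tasks such as Chinese word segmentation and named entity recognition. On named entity recognition, the F1 score of the proposed character vectors is 81.6%, against 80.2% for word2vec and 81.2% for GloVe. On word segmentation, the scores are 96.23%, 96.30%, and 96.31%, respectively. The analysis shows that the proposed model can effectively capture the stroke information of Chinese characters, and that it makes two innovations: using a CNN to capture the rules by which strokes form characters, and introducing attention to compute the contribution of each stroke to a character.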
Foreign-language abstract:	Data representation is a basic problem in machine learning. The first step in a machine learning task is to digitize the sample data. Unlike voice, image, and video data, natural language is inherently highly structured and abstract. Therefore, the primary task in natural language processing is to digitize the language. With the development of technology, the representation of natural language has improved considerably. From one-hot to distributed representations, the information that word embeddings contain has become much richer. However, existing statistical models cannot effectively represent unregistered words and low-frequency words. There is not yet an effective way to use stroke information to digitize Chinese words, owing to the limitation imposed by the "pictographic" characteristics of Chinese. We propose a novel Chinese word embedding model based on stroke combination, built on CBOW. We aim to provide high-quality word embeddings for unseen and low-frequency words by studying the rules of Chinese word formation. The Stroke2Vec model has the following innovations: using context information to digitize strokes, learning the rules of Chinese word combinations, and enriching the patterns of strokes with an attention mechanism and convolutional neural networks. We then test our model by comparing results among our model, Word2Vec, and GloVe on Named Entity Recognition, Chinese Word Segmentation, and Part-Of-Speech tasks. On the NER task, the F1 scores are 81.6%, 80.2%, and 81.2%. On the CWS task, the F1 scores are 96.23%, 96.30%, and 96.31%. Meanwhile, inspired by GANs, we extend the Skip-gram model of word2vec to try to incorporate stroke information into word vectors during training.
Classification number:	TP3
Total pages:	52
Number of references:	47
Call number:	017/M2018(401)
Release date:	2018-05-26
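
The F1 scores quoted in the abstract of this record combine precision and recall into a single number. As a minimal sketch of how such a score is typically computed for a task like named entity recognition, assuming entities are compared as exact `(start, end, label)` spans (the function name and the sample spans are hypothetical and are not taken from the thesis):

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall and F1 over two collections of labelled spans."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)  # spans that match exactly
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical data: each span is (start, end, label).
gold_spans = {(0, 2, "PER"), (5, 7, "ORG"), (9, 11, "LOC")}
predicted_spans = {(0, 2, "PER"), (5, 7, "LOC"), (9, 11, "LOC")}
p, r, f1 = precision_recall_f1(gold_spans, predicted_spans)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Here two of the three predicted spans match the gold spans exactly, so precision, recall, and F1 are all 2/3. The thesis may have scored its tasks differently (for example, per character rather than per span).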

Design and Implementation of a Personalized Intelligent English Writing Assistance System. Zhao Enhui (赵恩辉)

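Title:	Design and Implementation of a Personalized Intelligent English Writing Assistance System (英语智能写作个性化辅助系统的设计与实现)
Name:	Zhao Enhui (赵恩辉)
Student ID:	1501210804
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Access:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	Yu Jingsong (俞敬松)
Advisor 1 affiliation:	School of Software and Microelectronics
Defense date:	2018-05-26
Foreign-language title:	DESIGN AND IMPLEMENTATION OF ENGLISH INTELLIGENT PERSONALIZED ASSISTANT WRITING SYSTEM
Keywords:	vocabulary network; sentence recommendation; article recommendation; English writing level grading; English writing assistance system
Foreign-language keywords:	Vocabulary Network; Sentence Recommendation; Article Recommendation; English Writing Level; Computer-assisted English Writing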
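Abstract:	English writing plays an increasingly important role in everyday and workplace communication and in English learning. On the one hand, ideas and information can only be conveyed effectively when the content of a text is described richly and accurately; on the other hand, for learners whose native language is not English, writing also improves English proficiency, and writing in large quantities is a basic requirement of the "Length Approach" (写长法) theory of English teaching. Writing is nevertheless very hard for English learners, and many writing assistance systems have appeared to address this difficulty. Unlike those systems, the system described here is a personalized assistant that builds on each student's learning status and writing level and supports writing at the word, sentence, and document levels. The word assistance module first builds several dictionaries and extracts multiple kinds of relations between words to construct a vocabulary network; it then selects words to be expanded on the basis of syntactic analysis, keyword extraction, and the student's writing history, recommends candidate words from the vocabulary network relations, and replaces the words to be expanded. When grading English texts, Lexile considers only word frequency and sentence length and then fits a regression model. When judging a student's English writing level and the difficulty of a document, this system considers seven dimensions: average sentence structural complexity, average sentence length, the mean of the log word frequencies of the document, number of sentences, total number of words, maximum structural complexity of a single sentence, and length of the longest sentence. It combines these text features with the text content, using natural language processing and deep learning models together with other related techniques. The sentence assistance module first determines the student's English writing level and then recommends sentences whose difficulty suits that level; the recommendation relies mainly on short-text similarity techniques, and the recommended example sentences broaden the student's ideas and improve the accuracy and variety of sentence expression. The article assistance module first determines the student's English writing level and the difficulty of articles, then recommends model essays on similar topics that match the student's level in order to broaden the student's ideas; model essays are recommended mainly by computing topic similarity. The thesis also implements user interfaces for the word, sentence, and document processing modules and describes the composition and construction of each module's corpus. The word module was evaluated by the adoption rate of words to be expanded, which was 71.56%; writing difficulty grading was evaluated for accuracy on manually annotated data, giving 93.7%; the sentence module was evaluated for accuracy on manually annotated data, with a hit accuracy of 84.83% when taking 1 similar sentence and 97% when taking 6; the document module was evaluated by the click-through rate of recommended same-topic model essays, which was 29.375%. The thesis also presents the overall design of each system module and of the system architecture, and evaluates the system through manual scoring: out of a total of 5 points, the overall ratings in each aspect fall between 3 and 4. These metrics reflect how precise each function of the system is, which is the main factor determining the system's functionality and the accuracy of its personalization.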
Foreign-language abstract:	English writing plays an increasingly important role in daily life, especially in workplace communication and English learning. On the one hand, ideas and information can only be conveyed when the content of an article is described richly and accurately; on the other hand, writing is the most important strategy for improving English among non-native speakers, and writing in large quantities is a basic requirement of the "Length Approach", a theory of English teaching. However, writing is very difficult for English learners, and many writing assistance programs and systems exist to address this difficulty. Unlike them, this system is based on each student's individual learning status and writing level. It is a personalized writing assistant system that helps students write at the word, sentence, and topic levels. In the word module, we first construct several kinds of dictionaries, extract relations between words, and build a vocabulary network from them. We then select the words to be expanded based on syntactic analysis, keyword extraction, and the students' writing logs. Finally, we select candidate words from the vocabulary network to replace the words to be expanded. The Lexile of a text is established through a regression model that considers only sentence length and word frequency. By contrast, when determining a student's English writing ability and the level of a text, we consider 7 aspects: the average sentence structural complexity of the article, the average sentence length, the average word frequency of the article, the number of sentences in the article, the total number of words in the article, the structural complexity of the longest sentence in the article, and the total number of words in the longest sentence in the article. We combine these features with the textual content and implement the function with natural language processing and deep learning algorithms. In the sentence module, we first determine the level of the student's English writing and the level of every sentence in the database, and then recommend sentences that suit the student, to broaden their thinking and improve the accuracy and vividness of their writing; this is implemented with short-text similarity calculation. In the article module, we first determine the level of the student's English writing and the level of every document in the database, and then recommend documents that suit the student, to broaden their thinking and improve the vividness of their writing; this is implemented with text topic similarity calculation. The system provides an interface for users to use each module. In this paper we describe the construction of each module's corpus, the overall design of each module, and the system architecture. The word module is evaluated by the adoption rate of expanded words, which is 71.56%. The writing level is evaluated by accuracy, which is 93.7%. The sentence module is evaluated by accuracy, which is 84.83% when selecting the top 1 similar sentence and 97% when selecting the top 6. The article module is evaluated by click rate, which is 29.375%. We also evaluate system performance through users' manual scoring, and most aspects score 3 to 4 points.
Classification number:	TP3
Total pages:	73
Number of references:	45
Call number:	017/M2018(402)
Release date:	2018-05-26
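
Four of the seven difficulty dimensions listed in the abstract of this record are plain surface counts: number of sentences, total number of words, average sentence length, and longest sentence length. The sketch below shows one way such counts could be extracted. It assumes a naive regex-based sentence and word splitter. The function name `surface_features` and the sample text are hypothetical. It does not cover structural complexity or word frequency, which need a parser and a frequency list. The thesis's own feature extraction is not described in the record and may differ.

```python
import re
from statistics import mean


def surface_features(text):
    """Return simple length-based features of an English text."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    # Count alphabetic tokens (apostrophes allowed) in each sentence.
    lengths = [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]
    return {
        "sentence_count": len(sentences),
        "total_words": sum(lengths),
        "avg_sentence_length": mean(lengths) if lengths else 0.0,
        "max_sentence_length": max(lengths, default=0),
    }


sample = "Writing is hard. Practice makes it easier, and feedback helps too!"
features = surface_features(sample)
print(features)
```

For the sample text this gives 2 sentences, 11 words, an average sentence length of 5.5, and a longest sentence of 8 words. Features like these would then be fed, together with the remaining dimensions, into whatever grading model is used.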

Design and Implementation of English Handwriting Recognition Based on Deep Learning. Wang Wenjie (王文杰)

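Title:	Design and Implementation of English Handwriting Recognition Based on Deep Learning (基于深度学习的英文手写识别的设计与实现)
Author:	Wang Wenjie (王文杰)
Student ID:	1501210713
Language:	chi
Major:	Professional degree - Engineering - Computer Technology
Access:	After 3 years
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor name:	Yu Jingsong (俞敬松)
Advisor affiliation:	School of Software and Microelectronics
Defense date:	2018-05-26
Title (foreign language):	Design and Implementation of English Handwritten Recognition Based on Deep Learning
Keywords (Chinese):	handwriting recognition; convolutional neural networks; deep learning
Keywords (foreign language):	Handwritten Recognition; Convolutional Neural Networks; Deep Learning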
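Abstract:	Writing is one of the important markers of humanity's entry into civilized society, and it has driven the progress and development of human society. In today's technologically advanced world, converting these ancient symbols on paper into content that modern computers can recognize, store, and retrieve is of great significance. In recent years, with the rapid development of deep learning, computer recognition of single English characters has reached very high accuracy. However, because of differences in personal writing style and the joining of strokes between characters, recognizing an entire handwritten English string remains a challenging problem. This thesis studies, and designs models for, the two main tasks in handwriting recognition: offline and online handwriting recognition. For online handwriting recognition, five feature channels are extracted from information such as point coordinates and timestamps in the dataset, and a deep learning model is then designed using the CRNN architecture. For offline handwriting recognition, data processing uses several data augmentation methods, including random position shifts, contrast transformation, and added Gaussian noise; on the model side, the thesis designs a "CNN+Seq2Seq" model and a fully convolutional handwriting recognition model. Finally, a "vertical projection" algorithm is used to split images into lines, and the models from the thesis are applied in a real engineering project. In online handwriting recognition, evaluation on the test set shows that the CRNN-based model reaches a CER of 5.7%. Comparing the different features shows that the "turning angle of a point" feature has the greatest influence on the result. In offline handwriting recognition, the "CNN+Seq2Seq" model reaches a CER of 6.1%. The fully convolutional model reaches a CER of 8.6%; although this is slightly worse than the "CNN+Seq2Seq" model, it reduces training and prediction time by 40%. Finally, a comparison of the results of the different data processing methods shows that random displacement prevents overfitting well and improves recognition accuracy.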
Abstract (foreign language):	Text is one of the important signs of humanity's entry into civilized society, and it promotes the progress and development of human society. Nowadays, with the development of science and technology, it is of great significance to transform these ancient symbols on paper into content that can be identified, stored, and retrieved by modern computers. In recent years, with the rapid development of deep learning, the recognition of single English characters by computers has reached a high accuracy rate. However, recognizing a whole handwritten English string is still a challenging problem, owing to differences in personal writing styles and the adhesion of strokes between characters. In this paper, the two main tasks in the handwriting recognition field, offline handwriting recognition and online handwriting recognition, are studied and models are designed for them. In online handwriting recognition, we extract five feature channels from the coordinates and timestamp information of points in the dataset, and then design a deep learning model using the CRNN architecture. In offline handwriting recognition data processing, we use several different data augmentation methods, such as random displacement, contrast transformation, and added Gaussian noise. For the offline handwriting recognition model, we design two models, one based on "CNN+Seq2Seq" and the other fully convolutional. Finally, we use the "vertical projection" algorithm to divide the pictures into lines, and apply the models in this paper to an actual project. In online handwriting recognition, evaluation on the test set shows that the CRNN-based model reaches 5.7% CER. By comparing the different features, we find that the "rotation angle of a point" feature has the greatest influence on the results. In offline handwriting recognition, the "CNN+Seq2Seq" model reaches 6.1% CER. The fully convolutional handwriting recognition model reaches 8.6% CER. Although this result is slightly worse than that of the "CNN+Seq2Seq" model, the training and prediction time is reduced by 40%. Finally, by comparing the results of different data processing methods, we find that random displacement prevents overfitting well and improves recognition accuracy.
Classification number:	TP3
Total pages:	69
Number of references:	54
Call number:	017/M2018(403)
Release date:	2021-05-26

Design and Implementation of a Composition Analysis System Based on Machine Learning. 李海涛

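Title:	基于机器学习的作文分析系统设计与实现 (Design and Implementation of a Composition Analysis System Based on Machine Learning)
Author:	李海涛
Student ID:	1501210932
Language:	chi
Major:	Professional degree - Engineering - Software Engineering
Public availability:	after 3 years
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor:	俞敬松
Advisor affiliation:	School of Software and Microelectronics
Defense date:	2018-05-26
Title (English):	Composition analysis system based on machine learning
Keywords (Chinese):	自动打分 (automatic scoring); 主题分析 (topic analysis); 机器学习 (machine learning)
Keywords (English):	Automatic scoring; Thematic analysis; Machine learning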
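Abstract:	As the number of English learners grows, the workload of grading English essays has increased sharply. This thesis aims to build an English essay analysis system that combines automatic essay scoring with essay topic analysis. The system helps graders understand an essay's score and topic distribution more intuitively and evaluate it more comprehensively and accurately, which saves manpower and grading resources, reduces the pressure of essay grading, and improves the quality of English teaching. The thesis designs and implements an automatic essay scoring system and an essay topic analysis system. The automatic scoring system is based on deep learning: it first represents an essay as vectors, then trains a neural network to obtain a score prediction model, and finally uses that model to predict the score of an unscored essay. The models used in the experiments mainly include the Convolutional Neural Network (CNN), the Long Short-Term Memory network (LSTM), and the Bi-directional Long Short-Term Memory network (BiLSTM). Model performance is evaluated with the Quadratic Weighted Kappa. After tuning the model parameters and training iteratively, the highest average Kappa value reaches about 0.96, which indicates that the model's scores agree closely with human scores. Applying the model in a real setting yields an automatic essay scoring system that performs well. The topic analysis module has two parts: the first analyzes topic relatedness among multiple essays written on the same prompt, and the second analyzes topic continuity between the paragraphs of a single essay. The first part uses Term Frequency–Inverse Document Frequency (TF-IDF) to remove high-frequency words from the essays, then uses document vectors (Doc2Vec) to represent each essay as a multi-dimensional vector and maps it into a two-dimensional coordinate space. The relationships among the document vectors are used to identify excellent essays and off-topic essays. The second part represents the paragraphs of an essay as vectors and maps them into two-dimensional and one-dimensional spaces to show how the paragraphs are distributed in space and in sequence, and to judge whether their topics are continuous. A large number of examples in the experiments verify the effectiveness and practicality of the topic analysis system, whose results are presented as coordinate plots.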
Abstract (English version):	With the growing number of students learning English, the workload of correcting English compositions has increased dramatically. This thesis aims to implement an English composition analysis system that combines automatic composition scoring with composition topic analysis. The system helps graders understand the score and topic distribution of a composition more intuitively and evaluate it more comprehensively and accurately, thereby saving manpower and grading resources, reducing the pressure of composition correction, and improving the quality of English teaching. This thesis designs and implements an automatic composition scoring system and a composition topic analysis system. The automatic scoring system is based on deep learning. It first represents the composition as vectors, then obtains a score prediction model by training a neural network, and finally uses the prediction model to score a composition whose score is unknown. The models used in the experiments mainly include the Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Bi-directional Long Short-Term Memory (BiLSTM). The final performance of the model is evaluated using the Quadratic Weighted Kappa. By adjusting the model parameters and training the model iteratively, the highest average Kappa value reaches about 0.96, which shows that the model's scores are highly consistent with manual scores. Applying the model to a real scenario yields an automatic scoring system that performs well. The topic analysis module consists of two parts. The first part analyzes the topic relatedness among several compositions written on the same prompt. The second part analyzes the topic continuity between paragraphs of the same composition. The first part uses Term Frequency–Inverse Document Frequency (TF-IDF) to remove the high-frequency words in a composition, then uses the Document Vector (Doc2Vec) method to represent a composition as a multi-dimensional vector and map it into a two-dimensional coordinate space. The relationships between the document vectors are used to find the excellent compositions and the off-topic compositions. In the second part, the paragraphs of a composition are represented as vectors and mapped into two-dimensional and one-dimensional spaces to show the distribution of paragraphs in space and in sequence, and to determine whether the topic distribution of the paragraphs is continuous. In the experiments, a large number of examples verify the validity and practicability of the topic analysis system. This part presents its results as coordinate plots.
Classification:	TP3
Total pages:	87
Number of references:	42
Call number:	017/M2018(408)
Release date:	2021-05-26
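
The abstract of this record evaluates the scoring model with the Quadratic Weighted Kappa. As a rough illustration only, and not the thesis author's code, the metric can be sketched in plain Python as below. The function name, the integer rating range, and the sample ratings are assumptions made for this sketch.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Agreement between two lists of integer ratings (1.0 = perfect)."""
    n = max_rating - min_rating + 1
    if n < 2 or not rater_a or len(rater_a) != len(rater_b):
        raise ValueError("need two equal-length, non-empty rating lists "
                         "and at least two rating categories")

    # Observed confusion matrix between the two raters.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1

    total = len(rater_a)
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(n):
            # Quadratic penalty: larger disagreements cost more.
            weight = (i - j) ** 2 / (n - 1) ** 2
            expected = hist_a[i] * hist_b[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected

    if denominator == 0:
        # Both raters used one identical category for every item.
        return 1.0
    return 1.0 - numerator / denominator


# Hypothetical human and model scores on a 1-4 scale.
human = [1, 2, 3, 4, 4, 2]
model = [1, 2, 3, 4, 3, 2]
kappa = quadratic_weighted_kappa(human, model, 1, 4)
```

A value near 1 means the predicted scores track the human scores closely, which is how the abstract reads its reported value of about 0.96.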

Design and Implementation of an English Grammatical Error Correction System Based on Deep Learning. 陈宏业

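Title:	基于深度学习的英语语法纠错系统的设计与实现 (Design and Implementation of an English Grammatical Error Correction System Based on Deep Learning)
Author:	陈宏业
Student ID:	1501210881
Language:	chi
Major:	Professional degree - Engineering - Software Engineering
Public availability:	public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1:	俞敬松
Advisor 1 affiliation:	School of Software and Microelectronics
Defense date:	2018-05-26
Keywords:	深度学习 (deep learning); 语法错误纠正 (grammatical error correction); Attention; seq2seq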
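Abstract:	English remains one of the most widely used languages in the world, and the number of English learners grows by the day. With the development of the Internet and the sharing of knowledge, students are no longer limited to learning English from teachers in the classroom. Tools that automatically detect and correct grammatical errors can effectively help students study more efficiently and progress in self-study, and can also lighten teachers' workload. In China, Juku (句酷批改网) can score students' essays and detect errors based on its corpus and extracted grammatical patterns, but for most grammatical errors it cannot give suggestions or corrected results. An English grammatical error correction system corrects errors directly and makes up for this shortcoming. Before deep learning achieved its breakthroughs, this problem was mainly addressed with traditional machine learning and statistical machine translation methods. The great success of neural machine translation models in translation tasks has also advanced grammatical error correction. However, most research concentrates on the representation of word vectors on the encoder side, improving correction through increasingly complex representations. This thesis analyzes the shortcomings of the seq2seq model used in existing neural machine translation methods when applied to grammatical error correction, and modifies the seq2seq model according to the characteristics of the task so that it is better suited to error correction. The seq2seq model mainly emphasizes global information; even with soft attention it only retains as much of the important global information as possible. In error correction, however, a word itself carries information such as tense, morphological form, and error type. This thesis therefore proposes adding hard attention to strengthen the information carried by the word itself, and adding an extra network layer on the decoder side to extract error-type information captured from the global context or finer-grained local information; the two are combined before the error is corrected. On the grammatical error detection task, the F0.5 score on the FCE-public test set is 46.67%; on the grammatical error correction task, the F0.5 score on the CoNLL-2013 test set reaches 30.27%, which is 1.82% higher than the neural machine translation based model trained on a corpus of the same size.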
Abstract (English version):	At present, English is one of the most widely used languages in the world, and the number of English learners is increasing day by day. With the development and popularization of Internet technology and knowledge sharing, students no longer study only at school. Systems that automatically detect and correct grammatical errors can effectively help students improve their learning efficiency and develop the ability to study independently, and they also reduce the burden on teachers. In China, Juku can score students' writing and detect errors based on its corpora and extracted grammatical patterns. However, for most grammatical errors it does not offer suggestions on how to correct them. An English grammar correction system can correct errors directly and make up for this deficiency. Before the breakthrough of deep learning, a great deal of research focused on traditional machine learning methods and statistical machine translation methods to solve this problem. As neural machine translation models achieved great success in translation tasks, they also provided opportunities for grammatical error correction. Most researchers focus on the encoder side and on word embedding representations, making corrections more effective through increasingly complex representations. This thesis analyzes the weaknesses of the seq2seq model used in neural machine translation when applied to grammatical error correction, and modifies the seq2seq framework to make it more suitable for error correction tasks. Even with soft attention, seq2seq only retains the important global information, whereas a word itself carries information such as tense, form, and error type. This thesis therefore proposes combining hard attention with the seq2seq framework to strengthen the information the word carries. In addition, a network layer is added on the decoder side to extract error-type information captured from the global context or more detailed local information. The error detection task achieves an F0.5 of 46.67% on the FCE-public test set, and the error correction task achieves an F0.5 of 30.27% on the CoNLL-2013 test set. With a training corpus of the same size, this is 1.82% higher than the neural machine translation model.
Classification:	TP3
Total pages:	67
Number of references:	33
Call number:	017/M2018(409)
Release date:	2018-05-26
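
The abstract of this record reports its results as F0.5 scores, a variant of the F-measure that weights precision more heavily than recall. A minimal sketch of the general F-beta formula follows. The function name and the counts in the usage lines are hypothetical and are not taken from the thesis.

```python
def f_beta(true_pos, false_pos, false_neg, beta=0.5):
    """F-beta score from raw counts; beta < 1 favours precision."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Hypothetical counts: 30 correct edits, 20 spurious edits, 70 missed errors.
f05 = f_beta(30, 20, 70, beta=0.5)  # precision 0.6, recall 0.3
f1 = f_beta(30, 20, 70, beta=1.0)
```

With these counts F0.5 is higher than F1, because the system's precision is better than its recall. This is why F0.5 is a common choice when an unnecessary correction is considered worse than a missed one.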

Design and Implementation of an English Oral Pronunciation Evaluation System Based on Deep Learning. 吴琼

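Title:	基于深度学习的英语口语发音评测系统的设计与实现 (Design and Implementation of an English Oral Pronunciation Evaluation System Based on Deep Learning)
Author:	吴琼
Student ID:	1501210730
Language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public availability:	after 2 years
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor:	俞敬松
Advisor affiliation:	School of Software and Microelectronics
Defense date:	2018-05-26
Title (English):	Design and Implementation of English Oral Pronunciation Evaluation System Based on Deep Learning
Keywords (Chinese):	语音合成 (speech synthesis); 自动语音识别 (automatic speech recognition); 深度学习 (deep learning); 发音错误检测 (mispronunciation detection); BLSTM-CTC模型 (BLSTM-CTC model)
Keywords (English):	Speech synthesis; Automatic speech recognition; Deep learning; Mispronunciation detection; BLSTM-CTC model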
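Abstract:	According to Wikipedia statistics, about 300 million people in mainland China are English learners, and more than 10 million use English in daily communication. Traditional one-to-one teaching can no longer meet the growing demand for English learning. As technology develops, new online education models are breaking through traditional limitations. In recent years deep learning has advanced rapidly, and automatic speech recognition has improved markedly. However, mainstream mispronunciation detection methods require a pre-annotated error set, and the speech recognition models they rely on are trained on natural speech. This approach has a very high labor cost, and because the probability distributions of phonemes pronounced by native and non-native speakers differ, it is inherently unsuited to recognizing speech sequences from error-prone non-native learners. Starting from the goals of reducing manual annotation and supplementing the natural speech training set, this thesis studies pronunciation evaluation based on speech synthesis and speech recognition. The work is as follows: (1) collect the VCTK native-speaker audio corpus and the CMUDict pronunciation dictionary, generate standard phonemes with the g2p-seq2seq framework, and synthesize standard audio through the iFlytek, Google, and Microsoft Bing open speech platforms; (2) design a BLSTM-CTC model, compare several schemes for mixing synthetic and natural speech, implement seq2seq-based phoneme recognition, analyze in depth the erroneous phonemes of synthetic and natural speech, and introduce an NGram language model to implement character recognition; (3) collect student audio, obtain pronunciation scores from the predicted phoneme sequence supplemented by the predicted character sequence, implement error correction feedback based on LCS, CMUDict, DFS, and path selection, and analyze how well the evaluation scheme detects errors in student audio; (4) implement a pronunciation evaluation application with front-end and back-end interaction, built on the Flask web framework. The results show that, after synthetic speech is introduced: (1) for speech recognition, the PER and CER on the natural speech test set are 32.6% and 34.3% respectively, the best among all models, which confirms that using synthetic speech does not degrade recognition of natural speech and can compensate for a shortage of corpus data; (2) for student speech recognition and error correction, PER and CER fluctuate slightly, and the F1 scores for erroneous phonemes and error positions are 73.58% and 72.53% respectively, which shows that the synthetic-speech-based recognition and correction scheme is effective; (3) for pronunciation scoring, the average score drops slightly overall, scoring of student pronunciation is stricter, and differences between student levels are distinguished better, which shows that using synthetic speech to solve the pronunciation scoring problem has practical value.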
Abstract (English version):	According to Wikipedia statistics, as of 2017 about 300 million people in China are English learners, and more than 10 million people use English every day. The traditional one-to-one teaching method can no longer meet people's growing demand for English learning. With the development of science and technology, new online education models are breaking traditional limitations. In recent years, deep learning has grown rapidly, and Automatic Speech Recognition has improved significantly. However, mainstream mispronunciation detection methods need pre-annotated error sets, and the speech recognition models they depend on are trained on natural speech. This approach has a huge labor cost, and because the probability distributions of phonemes produced by native and non-native speakers differ, it is essentially unsuitable for speech sequence recognition of non-native speech. To reduce manual annotation and supplement the natural speech training set, this thesis studies a new mispronunciation detection method based on speech synthesis and speech recognition. The following work is done: (1) acquire the VCTK corpus of native speakers' speech and the CMU pronunciation dictionary, use the g2p-seq2seq framework to generate standard phonemes, and access the iFlytek, Google, and Microsoft Bing open platforms to synthesize standard speech; (2) design a BLSTM-CTC model, compare various schemes for fusing synthetic and natural speech, and achieve phoneme recognition based on seq2seq; analyze the erroneous phonemes of synthetic and natural speech in depth, and introduce an NGram language model to achieve character recognition; (3) collect students' audio, obtain pronunciation scores based on predicted phoneme and character sequences, achieve error correction feedback using LCS, CMUDict, DFS, and path selection, and analyze the detection performance on students' audio; (4) based on the Flask web development framework, implement a pronunciation evaluation application with front-end and back-end interaction. The research shows that, after introducing synthetic speech, (1) for the ASR task, PER and CER on the natural speech test set are 32.6% and 34.3%, the best results of all experiments, confirming that synthetic speech not only preserves the recognition performance on natural speech but also compensates for the lack of speech corpus; (2) for the students' speech recognition and mispronunciation correction tasks, PER and CER fluctuate slightly, and the F1 values for erroneous phonemes and error positions are 73.58% and 72.53%, showing that the proposed synthetic-speech-based mispronunciation correction scheme is effective; (3) for the pronunciation scoring task, the average score decreases slightly, scoring with synthetic speech is stricter and distinguishes students' grade differences better, indicating that synthetic speech has practical value for solving pronunciation scoring problems.
Classification:	TP3
Total pages:	77
Number of references:	34
Call number:	017/M2018(413)
Release date:	2020-05-26

面向英语智能学习的知识库系统的设计与实现.梁彪

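Title:	面向英语智能学习的知识库系统的设计与实现
Author:	梁彪
Student ID:	1501210601
Language:	chi
Major:	Professional degree - Engineering - Computer Technology
Release time:	after 3 years
Level of study:	Master's
Degree:	Professional Master of Engineering
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor:	俞敬松
Advisor's affiliation:	School of Software and Microelectronics
Defense date:	2018-05-26
Title (foreign language):	Design and Implementation of Knowledge Base System for English Intelligent Learning
Keywords (Chinese):	语块提取; 句型提取; 机器学习; 句法分析; 系统设计与实现
Keywords (foreign language):	Chunk Extraction; Sentence Pattern Extraction; Machine Learning; Syntax Analysis; System Design and Implementation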
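Abstract:	English writing is an important form of communication in English and carries significant weight in English examinations. In the course of learning English, however, students in China have insufficient basic knowledge and lack writing material, so when writing they can only piece words together and cannot express their thoughts and feelings accurately and fluently; as a result, English writing becomes the weak point of their English ability. The author surveyed the current state of English writing in China and confirmed that it is more reasonable to begin learning English from chunks and sentence patterns. After learning chunks and sentence patterns, and building on an understanding and command of basic vocabulary, learners can combine them appropriately, which provides basic language material for learning more complex structures and sentences; learners can also practice repeatedly with whole chunks and whole sentences to cultivate the logic of English usage, and thereby improve their English writing ability. Taking students preparing for the postgraduate entrance examination as representative of English learners in China, this thesis analyzes and surveys their needs and the problems they encounter in English writing. Based on language learning and English writing pedagogy, and combining the characteristics of chunk and sentence pattern teaching, it develops a knowledge base system for intelligent language learning aimed at the pre-writing preparation stage of English writing. The design of the knowledge base centers on its basic language material, namely chunks and sentence patterns, so the main objectives of this thesis are a chunk extraction system based on machine learning and a sentence pattern extraction system based on syntactic analysis. The aim is to give learners in China more and richer basic material when learning English writing, so that they can write articles on a given topic more quickly and in a more standard way. The system has three innovations: 1) Teaching material of chunks and sentence patterns organized by writing topic is, compared with learning isolated words, more conducive to students' understanding and use of vocabulary and helps them complete writing tasks quickly. 2) Chunk extraction combines traditional factors such as string frequency statistics, internal cohesion, and external boundary independence, and identifies chunks with a machine learning algorithm; the F value of the extraction results is 11.6% higher than that of the traditional string frequency method. 3) Sentence patterns are extracted on the basis of syntactic analysis, using the syntactic structure tree to generalize words; experiments show that the method is effective, with accuracy above 85%.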
Abstract (foreign language):	English writing, as an important means of communication in English, carries a considerable share of the score in English exams. However, in the process of learning English, students in China lack basic knowledge and writing material, so in writing they can only patch words together and cannot express their thoughts and feelings accurately and fluently. As a result, English writing becomes the weak point of their English ability. The author investigates the current state of English writing in China and confirms that it is more reasonable to learn English starting from chunks and sentence patterns. After studying chunks and sentence patterns, learners combine them with their understanding and mastery of basic vocabulary, which provides basic language material for learning more complex structures and sentences. Repeated practice with whole chunks and whole sentences also cultivates the logic of English usage and thereby improves English writing ability. This thesis focuses on analyzing and investigating the needs of, and problems encountered by, English learners in China, represented by students preparing for the postgraduate entrance examination. Based on language learning and English writing teaching methods, and combining the characteristics of chunk and sentence pattern teaching, it develops a knowledge base system for intelligent language learning aimed at the preparatory phase of English writing. The system is divided mainly into a machine-learning-based chunk extraction system and a syntactic-analysis-based sentence pattern extraction system, and is designed to give learners in China more and richer basic material for English writing, so that they can write articles on specified topics more quickly and accurately.
The innovation of this system consists of three parts: 1) Teaching materials of chunks and sentence patterns based on the writing topic are more conducive to students' understanding and use of vocabulary than learning vocabulary alone, and help students complete their writing tasks quickly. 2) The extraction of lexical chunks combines the features of traditional string frequency statistics, internal cohesion, and external boundary independence. A machine learning algorithm is used to identify lexical chunks, and the F value of the extraction result is increased by 11% relative to the baseline. 3) Sentence pattern extraction is performed based on syntactic analysis. The syntactic tree is used to generalize the words. Experiments show that the method is effective and that its F value can reach 0.85 or more.
Classification number:	TP3
Total pages:	61
Number of references:	33
Call number:	017/M2018(425)
Release date:	2021-05-26

基于深度学习的实体关系抽取的研究.唐弘毅

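Title:	基于深度学习的实体关系抽取的研究
Name:	唐弘毅
Student ID:	1501210989
Thesis language:	chi
Major:	Professional degree - Engineering - Software Engineering
Release time:	public
Level of study:	Master's
Degree:	Professional Master of Engineering
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	俞敬松
Advisor 1 affiliation:	School of Software and Microelectronics
Thesis defense date:	2018-05-26
Foreign-language title:	A Research on Entity Relation Extraction Based on Deep Learning
Keywords:	实体关系抽取; 深度学习; 依存分析
Foreign-language keywords:	Entity relation extraction; Deep learning; Dependency parse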
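Thesis abstract:	Today's Internet contains more and more knowledge. With this knowledge we can not only build huge knowledge bases but also apply it in many fields such as intelligent question answering. How to collect and use this knowledge has therefore become a meaningful research topic. Most knowledge can be represented by relations between entities, so the process of mining knowledge from text can, to some extent, be regarded as the process of extracting relations between entities. Research on entity relation extraction has a long history. At first, people used only rule-based methods to solve the problem; in most cases these methods were time-consuming and labor-intensive, and their results were poor. With the rise of statistical learning, machine learning techniques brought considerable breakthroughs on this problem. With the continued development of deep learning in recent years and the successive proposal of models such as RNN and CNN, results have improved further. This thesis uses recent deep learning techniques, extracts four kinds of features (word embeddings, hypernym embeddings, part of speech, and relative position), and adopts two strategies to solve the problem, one based on the raw text structure and one based on the dependency parse structure. The former relies mainly on an RNN-based attention mechanism to capture keyword information and on a CNN model to capture phrase collocation information. The latter builds a CNN model on the dependency path. Compared with the former, it adds a dependency parsing preprocessing step, but because its input is small it has a large advantage in training speed, and its final results are not much worse. Given the heterogeneity of the two methods, this thesis also adopts two ensemble strategies to combine them and obtain better classification results. On SemEval-2010 Task8, a classic dataset for entity relation extraction, this thesis achieves a competitive F value of 85.2%, which shows that combining strategies based on two different structures can solve the entity relation extraction problem more effectively.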
Foreign-language abstract:	Today's Internet contains more and more knowledge. Using this knowledge, we can not only construct huge knowledge bases but also improve intelligent question answering systems. Therefore, how to collect and apply this knowledge has become a meaningful research field. Most knowledge can be represented by relations between entity pairs, so mining knowledge from raw text can, to some extent, be seen as extracting relations between entity pairs. People have been doing research on entity relation extraction for many years. At the beginning, people tried to solve this problem with rule-based methods. Such methods cost a lot of time and human resources and, what is worse, their results are not very good. Fortunately, with the development of statistical learning methods, people made a breakthrough in this area by using machine learning technology. With the increasing popularity of deep learning and the proposal of more and more novel models such as RNN and CNN, the performance of entity relation extraction models has become better and better. This thesis discusses how to use state-of-the-art deep learning technology to solve this problem. First, we extract four features: word embedding, hypernym embedding, part of speech, and relative position. We then propose two strategies, based on raw text structure and dependency tree structure respectively. The former strategy uses an attention mechanism to capture key words and a CNN model to extract collocation features, while the latter constructs a CNN model on the dependency path. Compared with the former strategy, the latter needs dependency parsing, but because of its small input scale it has a huge advantage in training speed, and its performance is almost as good as that of the former.
Because of the heterogeneity of these two strategies, we propose two ensemble algorithms to combine them in order to obtain better classification results. On the classic open data set SemEval-2010 Task8, we find that our best strategy reaches an F score of 85.2%, which shows that combining strategies based on different structures is an effective method for entity relation extraction.
Classification number:	TP3
Total pages:	56
Total references:	40
Call number:	017/M2018(441)
Release date:	2018-05-26

数据驱动的海洋意识评价指标体系的构建与实证研究.王一博

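Title:	数据驱动的海洋意识评价指标体系的构建与实证研究
Name:	王一博
Student ID:	1601210756
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Release time:	public
Level of study:	Master's
Degree:	Professional Master of Engineering
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	俞敬松
Advisor 1 affiliation:	School of Software and Microelectronics
Thesis defense date:	2018-05-26
Keywords:	数据驱动 (data-driven); 指标体系 (indicator system); 海洋意识 (ocean awareness); 实证研究 (empirical research)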
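Thesis abstract:	There is currently no universally accepted method, in China or abroad, for constructing evaluation indicator systems; most of the academic community relies on multiple rounds of expert review. The 《国民海洋意识发展指数报告（2016）》 (National Ocean Awareness Development Index Report (2016)), published in 2016, used a qualitative method to construct an indicator system and comprehensively evaluated the level of ocean awareness development in 31 provinces of China. That study still has many shortcomings: the indicator system is constructed in a highly subjective way; the keywords in the indicators are hard to extend; the words expressing some third-level indicators do not give full coverage; the cost in labor and resources is high; and the research cycle is too long. After systematically reviewing and summarizing methods for constructing comprehensive evaluation indicator systems in China and abroad, this thesis proposes a data-driven method for constructing an evaluation indicator system that is based on multi-source data and takes ocean awareness as its theme, and verifies the effectiveness of the method through an empirical study. The research focuses on how to apply techniques such as co-word analysis, cluster analysis, and social network analysis, in a data-driven way, to the construction and semi-automatic generation of an evaluation indicator system, and on how to incorporate the theory and methods of comprehensive evaluation and use the indicator system to compute and visualize results for each evaluated object. The specific work includes: (1) Using ocean-related data from multiple sources, including literature, television news, newspapers, web pages, and microblogs, an ocean vocabulary and an ocean word vector model (Word2Vec) were built, laying the foundation for subsequent Chinese word segmentation and keyword expansion. (2) Based on the co-occurrence relations of ocean awareness subject terms and the results of cluster analysis, an ocean awareness evaluation indicator system was constructed and its validity was tested. (3) The overall ocean awareness score and the scores on the four first-level indicators were computed for the 31 provinces of mainland China; the results correlate strongly with the 2016 ranking. (4) Overall ocean awareness scores were computed for 333 prefecture-level cities across the provinces, a result that is difficult to obtain with traditional methods. (5) Based on the above research, an ocean awareness data visualization platform was built, enabling comparative analysis and visualization of the ocean awareness scores of the 31 provinces and 333 prefecture-level cities. This thesis constructs an evaluation indicator system with a data-driven method and verifies its validity, which can serve as a reference for constructing similar indicator systems.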
Classification number:	TP3
Total pages:	105
Total references:	86
Call number:	017/M2018(510)
Release date:	2018-05-26

基于深度神经网络的弱监督人脸识别方法研究.于程程

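Title:	基于深度神经网络的弱监督人脸识别方法研究
Author:	于程程
Student ID:	1301211056
Language:	chi
Major:	Engineering - Software Engineering
Release time:	after 3 years
Level of study:	Master's
Degree:	Professional Master of Engineering
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor:	俞敬松
Advisor's affiliation:	School of Software and Microelectronics
Defense date:	2018-05-26
Title (foreign language):	Research on Weak-supervised Face Recognition Based on Deep Learning
Keywords (Chinese):	弱监督学习; 人脸识别; 深度学习; 多层感知机; 卷积神经网络
Keywords (foreign language):	weak supervision learning; face recognition; deep learning; multi-layer perceptron; convolutional neural networks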
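Abstract:	Face recognition technology has broad application prospects. In the fields of national security and public safety, it can effectively help prevent violent terrorist incidents and group conflicts, and it supports projects related to urban development such as "safe city" and "smart city". Because of these important application scenarios, face recognition technology is developing very rapidly. Two key drivers of the current wave of progress are the accumulation of large amounts of face data and the development of deep learning algorithms. Deep learning algorithms have very strong model expressiveness: given enough training data, they can not only memorize the data effectively but also generalize regularities from it to make inferences about new data. The "big data + deep learning" approach has therefore become the standard configuration of today's face recognition systems. In practice, however, this approach has its problems. For example, the data used to train deep learning models usually needs label information (that is, each image must be annotated with whose face it shows), but annotating large amounts of image data is time-consuming and labor-intensive and costs practitioners and researchers a great deal of time and effort; in addition, because of annotation quality, many labels are wrong. In other words, in many cases there are too few valid labeled examples to train a model, while unlabeled examples or examples with noisy labels are plentiful. Can an effective face recognition model be trained in this setting? We call this setting weakly supervised face recognition. This thesis mainly considers the weakly supervised scenario in which noise-free labeled examples are relatively scarce. To address the tendency of current deep neural networks to overfit when only a small amount of labeled data is available, the thesis starts from the classic multilayer perceptron (MLP) framework and combines a deep autoencoder with a deep feedforward network to design a deep neural network model that learns from labeled and unlabeled examples at the same time, with the aim of improving the classification accuracy of the MLP when labeled data is scarce. For the more complex convolutional neural networks, the thesis also designs a corresponding deep model to improve face recognition in weakly supervised scenarios. Specifically, it uses deconvolution operations to implement an autoencoder based on a convolutional architecture and shares weights between the encoder part of the autoencoder and the convolutional network, yielding a deep convolutional neural network model that makes effective use of a large number of unlabeled examples together with a small number of labeled ones. Because deep networks that learn from both labeled and unlabeled examples are very difficult to train, the thesis further introduces skip connections based on linear transformations between the encoder and the decoder, which resolves this difficulty and greatly improves face recognition in weakly supervised settings. Experiments were carried out on large-scale datasets including YaleB, PIE, CelebA, and FaceScrub, and the results demonstrate the effectiveness and practicality of the proposed models and methods. In addition, from an engineering perspective, we designed and implemented a face recognition system that integrates offline model training and online testing.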
Abstract (foreign language):	Face recognition technology has a wide range of application prospects. In the fields of national security and public safety, face recognition technology can effectively help prevent violent terrorist incidents and group conflicts, and can help promote the construction of "safe cities". Because of these important application scenarios, face recognition technology is currently developing very rapidly. Two important factors in this round of rapid progress are the accumulation of large amounts of face data and the development of deep learning algorithms. Because deep learning algorithms have very powerful model expressiveness, when provided with enough training data they can not only memorize the data effectively but also generalize the relevant regularities from it. Therefore, the "big data + deep learning" approach has become the standard for today's face recognition systems. However, this approach also has certain problems. The data used to train deep learning models usually needs label information (that is, each image must be annotated with whose face it shows), but labeling a large amount of image data is a very time-consuming and labor-intensive task. In other words, in many cases there are not enough labeled samples to train the model, while there are many unlabeled samples. Can we train an effective face recognition model in this scenario? We call this scenario weakly supervised face recognition. This thesis addresses the problem that current deep neural networks easily overfit when only a small amount of labeled data is available. It starts from the classical multilayer perceptron (MLP) framework and designs deep models by combining a deep autoencoder with a deep feedforward network.
The proposed model improves the classification accuracy of the multilayer perceptron in scenarios where the amount of labeled data is small. In addition, for the more complex convolutional neural networks, the thesis also designs a corresponding deep semi-supervised model to improve face recognition in weakly supervised scenarios. It uses deconvolution operations to implement an autoencoder based on a convolutional network architecture, shares weights between the encoder of the autoencoder and the convolutional network, and thereby obtains a deep semi-supervised learning model based on the convolutional neural network. Because such deep semi-supervised networks are difficult to train, the thesis further introduces skip connections between the encoder and the decoder, which greatly improves face recognition performance in a weakly supervised setting. Experiments were conducted on large-scale datasets such as YaleB, PIE, CelebA, and FaceScrub. The experimental results demonstrate the effectiveness of the proposed models and methods. In addition, we design and implement a demo system for face recognition.
Classification number:	TP3
Total pages:	81
Number of references:	45
参考文献：	︿参考文献 [1]. Belhumeur, P.N.; Hespanha, J.P.; Kriegman, D.J. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19, 711–720. [2]. Hoffmann, H. Kernel PCA for novelty detection. Pattern Recognit. 2007, 40, 863–874. [3]. Lades, M.; Vorbruggen, J.C.; Buhmann, J.; Lange, J.; von der Malsburg, C.; Wurtz, R.P.; Konen, K. Distortion invariant object recognition in the dynamic link architecture. IEEE Trans. Comput. 1993, 42, 300–311. [4]. Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110. [5]. Nebti, S.; Fadila, B. Combining classifiers for enhanced face recognition. In Advances in Information Science and Computer Engineering; Springer: Dordrecht, The Netherlands, 2015; Volume 82. [6]. Singha, M.; Deb, D.; Roy, S. Hybrid feature extraction method for partial face recognition. Int. J. Emerg. Technol. Adv. Eng. Website 2014, 4, 308–312. [7]. Sompura, M.; Gupta, V. An efficient face recognition with ANN using hybrid feature extraction methods. Int. J. Comput. Appl. 2015, 117, 19–23 [8]. Alex Krizhevsky, Ilya Sutskever, and Geoff Hinton. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25, pages 1106–1114, 2012. [9]. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z.,Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet Large Scale Visual Recognition Challenge. IJCV (2015) [10]. Srivastava, R.K., Gre_, K., Schmidhuber, J.: Training very deep networks. In: NIPS. (2015) [11]. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition.In: CVPR. (2016) [12]. Gary B. Huang. 2012. Learning hierarchical representations for face verification with convolutional deep belief networks. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (CVPR '12). 
IEEE Computer Society, Washington, DC, USA, 2518-2525. [13]. Y. Taigman, M. Yang, M. Ranzato and L. Wolf, "DeepFace: Closing the Gap to Human-Level Performance in Face Verification," 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, 2014, pp. 1701-1708. [14]. Y. Sun, X. Wang and X. Tang, "Deep Learning Face Representation from Predicting 10,000 Classes," 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, 2014, pp. 1891-1898. [15]. F. Schroff, D. Kalenichenko and J. Philbin, "FaceNet: A unified embedding for face recognition and clustering," 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, 2015, pp. 815-823. [16]. Quanxue Gao, Yunfang Huang, Xinbo Gao, Weiguo Shen, Hailin Zhang,A novel semi-supervised learning for face recognition,Neurocomputing,Volume 152,2015,Pages 69-76 [17]. Haitao Gan, Nong Sang, and Rui Huang, "Self-training-based face recognition using semi-supervised linear discriminant analysis and affinity propagation," J. Opt. Soc. Am. A31, 1-6 (2014) [18]. Du W., Inoue K., Urahama K. (2005) Dimensionality Reduction for Semi-supervised Face Recognition. In: Wang L., Jin Y. (eds) Fuzzy Systems and Knowledge Discovery. FSKD 2005. Lecture Notes in Computer Science, vol 3614. Springer, Berlin, Heidelberg [19]. Pang Ying Han, Ooi Shih Yin, Goh Fan Ling, "Semi-supervised generic descriptor in face recognition", Signal Processing & Its Applications (CSPA) 2015 IEEE 11th International Colloquium on, pp. 21-25, 2015. [20]. Wang F, Zhang C. Label Propagation through Linear Neighborhoods. IEEE Transactions on Knowl-edge and Data Engineering, 2008, 20(1):55–67. [21]. He R, Zheng W S, Hu B G, et al. Nonnegative sparse coding for discriminative semi-supervised learning. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011. 2849–2856. [22]. Zhuang L, Gao H, Lin Z, et al. 
Non-negative low rank and sparse graph for semi-supervised learning.IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012. 2328–2335. [23]. Yichuan Tang, Ruslan Salakhutdinov, Geoffrey E. Hinton, Deep lambertian networks. Proceedings of the 29th International Conference on Machine Learning (ICML 2012), 2012. [24]. S. Gao, Y. Zhang, K. Jia, J. Lu and Y. Zhang, "Single Sample Face Recognition via Learning Deep Supervised Autoencoders," in IEEE Transactions on Information Forensics and Security, vol. 10, no. 10, pp. 2108-2118, Oct. 2015. [25]. T. H. Chan, K. Jia, S. Gao, J. Lu, Z. Zeng and Y. Ma, "PCANet: A Simple Deep Learning Baseline for Image Classification?," in IEEE Transactions on Image Processing, vol. 24, no. 12, pp. 5017-5032, Dec. 2015. [26]. Yao Sun, Lejian Ren, Zhen Wei, Bin Liu, Yanlong Zhai, and Si Liu. 2017. A weakly supervised method for makeup-invariant face verification. Pattern Recogn. 66, C (June 2017), 153-159. DOI: https://doi.org/10.1016/j.patcog.2017.01.011 [27]. Chen, Binghui Deng, Weihong. Weakly-supervised deep self-learning for face recognition. ICME, 2016. [28]. Noroozi, Vahid, Lei Zheng, Sara Bahaadini, Sihong Xie and Philip S. Yu. “SEVEN: Deep Semi-supervised Verification Networks.” IJCAI (2017). [29]. Blum A and Mitchell T. Combining Labeled and Unlabeled Data with Co-training [A]. Proceedings of the 11th Annual Conference on Computational Learning Theory [C]. New York, USA: ACM press,1998: 92-100. [30]. Nigam K，Ghani R．Analyzing the effectiveness and applicability of co-training [C]．Proceedings of the 9th ACM International Conference on Information and Knowledge Management(CIKM)，McLean，VA，2000． [31]. Blum A and Chawla S. Learning from Labeled and Unlabeled Data using Graph Mincuts [A].Proceedings of the 18th International Conference on Machine Learning [C]. Francisco, CA: Morgan Kaufmann, 2001: 19-26. [32]. Zhu X J, Lafferty J, and Ghahramani Z. 
Semi-supervised Learning using Gaussian Fields and Harmonic Functions [A]. Proceedings of the 20th International Conference on Machine Learning [C]. 2003: 912-919. [33]. V. Castelli, T. Cover. The exponential value of labeled samples. Pattern Recognition Letters. 1995, (16): 105–111. [34]. G. Haffari and A. Sarkar. Analysis of semi-supervised learning with the Yarowsky algorithm. UAI, pages 159-166, 2007. [35]. Zhou D Y, Bousquet O, Lal T, et al. Learning with Local and Global Consistency [A]. Advances in Neural Information Processing Systems 16 [C]. 2004, 16: 321-328. [36]. Rasmus A, Valpola H, Honkala M, et al. Semi-supervised Learning with Ladder Networks. Proceedings of the 28th International Conference on Neural Information Processing Systems, 2015. 3546–3554. [37]. Vincent P, Larochelle H, Lajoie I, et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion. J. Mach. Learn. Res., 2010, 11:3371–3408. [38]. Ioffe S, Szegedy C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. ICML, 2015. [39]. Georghiades A, Belhumeur P, Kriegman D. From Few to Many: Illumination Cone Models for Face Recognition under Variable Lighting and Pose. IEEE Trans. Pattern Anal. Mach. Intelligence, 2001, 23(6):643–660. [40]. Sim T, Baker S, Bsat M. The CMU Pose, Illumination, and Expression (PIE) Database. Proceedings of the Fifth IEEE International Conference on Automatic Face and Gesture Recognition, 2002. [41]. S. Yang, P. Luo, C. C. Loy, and X. Tang, "From Facial Parts Responses to Face Detection: A Deep Learning Approach", in IEEE International Conference on Computer Vision (ICCV), 2015 [42]. H.-W. Ng, S. Winkler. A data-driven approach to cleaning large face datasets. Proc. IEEE International Conference on Image Processing (ICIP), Paris, France, Oct. 27-30, 2014. [43]. LeCun Y, Bottou L, Orr G B, et al. Efficient BackProp. Neural Networks: Tricks of the Trade, 1998. [44]. 
Kingma D P, Ba J. Adam: A Method for Stochastic Optimization. ICLR, 2015. [45]. Hyeonwoo Noh, Seunghoon Hong, Bohyung Han，Learning Deconvolution Network for Semantic Segmentation, ICCV 2015。﹀
馆藏号：	017/M2018(01)
公开日期：	2021-05-26

基于paraphrase generation的英语作文辅导功能的后端设计和实现.万泽宇

题名：	基于paraphrase generation的英语作文辅导功能的后端设计和实现
姓名：	万泽宇
学号：	1501210694
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	1年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2018-05-26
外文题名：	The implementation of a backend system for teaching writing compositions based on paraphrase generation technique
关键词：	转写生成；英语作文；后端设计与实现
外文关键词：	paraphrase generation; readability improvement; the design and implementation of backend system
论文摘要：	︿本篇论文是基于转写生成 (paraphrase generation) 技术的英语作文辅导功能的后端系统设计与实现，主要功能是基于转写生成技术对学生写过的句子进行扩展。过程写作理论说明，增大学生的语言输入，能增强学生在写作阶段的写作水平，所以本文尝试以句子扩展的形式增大学生的输入。在扩展句子方面，市面上的英语作文辅导系统的扩展功能大多是基于数据库查找关键词来进行扩展，这样获得的句子，相关性不强，且过于依赖数据库的大小。而使用转写生成技术进行的句子扩展无需进行关键字查找，对新数据处理能力较好，得到的结果相关性较强，能给学生带来更多的启发。本文的系统设计与实现分为两部分，分别是转写算法应用系统的研究和提供配套服务的后台系统的实现。算法应用系统主要提供转写生成和转写的难度控制的功能。而后台应用系统主要提供一系列的后台服务，包括对外暴露 API 并管理用户，并异步调用算法应用系统。在算法实现方面，本文在转写生成上，使用了本是用于缩写的模型结构 [25]，以满足实际应用的需求，并加入残差网络机制 [23] 以提高效果。在难度控制方面，本文使用了 Simple PPDB 语料库，根据 SyntaxNet 给出的词性标注进行替换，提高了可读性，控制了生成的转写的难度。在后台系统设计方面，本文基于 Flask 进行了后台框架设计，设计了用户鉴权和数据持久化的方案。同时在实际应用算法方面，对算法模块进行了异步调用，提高了系统效率。在实验部分，本文测试了几种改进模型，并使用基于词嵌入的评价标准进行了评测，结果表明新的改进的方法达到了目前的较高水平。由于转写和机器翻译的不同，本文还设计了语义区分度和句法相似度两个指标进行评测，在这两个指标上本文的改进算法优势较大。为了弥补句法相似度指标中自动句法分析可能出现的错误，本文使用了众包评测的形式，对文中提出的改进方法生成的转写进行了评测，结果显示，文中提出的转写生成方法在实际应用中有一定优势，生成的转写在语义上和原始文本更为接近，在句法结构上和原始文本产生了较大区别，符合对于好的转写的定义。在对生成的转写的难度控制上，成功在一定程度上控制了转写的难度。英语写作能力的提高本就是一个复杂的过程，本篇论文使用转写生成技术扩展句子，从而扩展用户思维，获得了一些初步成果，但仍然需要进行完善和尝试。﹀
外文摘要：	︿ This article is about the implementation of a backend system for teaching writing compositions based on paraphrase generation technique. The main function of this system is to extend the sentences which students write by using paraphrase generation technique. The theory of writing as a process indicates that increasing the language input to students can improve students' writing ability. Most composition writing aid systems use keywords to search for relevant sentences, which can be irrelevant to the original sentence in meaning. Using paraphrase generation technique to generate relevant sentences can overcome this shortcoming. The design and implementation of the system focus on two parts. The first part is about the paraphrase generation algorithm. The second part is about the engineering of the backend system, which provides a REST service. The algorithm part generates paraphrases according to the student's writing. The author added a residual mechanism to a deep neural model called pointer-generator to better generate meaningful paraphrases. After generation, the system provides a difficulty control function by using the Simple PPDB database. As for the engineering of the backend system, the system is based on Flask and uses Celery and Redis to call the algorithm asynchronously. After automatic evaluation experiments and a crowd-sourcing test, the paraphrase generation algorithm in this article generates high-quality paraphrases which are more relevant to the original sentence in meaning and are drastically different in sentence structure. ﹀
分类号：	TP3
论文总页数：	99
参考文献总数：	34
参考文献列表：	︿ [1] Emil Abrahamsson, Timothy Forni, Maria Skeppstedt et al. Medical text simplification using synonym replacement: Adapting assessment of word difficulty to a compounding language. In: Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR), 2014: 57–65. [2] Konstantin Aksyonov, Eugene Bykov and Olga Aksyonova. Real time simulation models integrated into the corporate information systems. In: Control Conference (CCC), 2014 33rd Chinese. IEEE, 2014: 6810–6813. [3] Abdel Karim Al Tamimi, Manar Jaradat, Nuha Al-Jarrah et al. AARI: automatic arabic readability index. Int. Arab J. Inf. Technol. 2014, 11(4): 370–378. [4] Dzmitry Bahdanau, Kyunghyun Cho and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014. [5] Samuel R. Bowman, Luke Vilnis, Oriol Vinyals et al. Generating sentences from a continuous space. arXiv preprint arXiv:1511.06349, 2015. [6] Ziqiang Cao, Chuwei Luo, Wenjie Li et al. Joint Copying and Restricted Generation for Paraphrase. In: AAAI, 2017: 3152–3158. [7] David L. Chen and William B. Dolan. Collecting Highly Parallel Data for Paraphrase Evaluation. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1. Stroudsburg, PA, USA: Association for Computational Linguistics, 2011: 190–200. http://dl.acm.org/citation.cfm?id=2002472.2002497, retrieved on 2018-03-19. [8] Cedric De Boom, Steven Van Canneyt, Steven Bohez et al. Learning semantic similarity for very short texts. In: Data Mining Workshop (ICDMW), 2015 IEEE International Conference on. IEEE, 2015: 1229–1234. [9] Roy T. Fielding and Richard N. Taylor. Architectural styles and the design of network-based software architectures. University of California, Irvine Doctoral dissertation, 2000. [10] Karën Fort, Gilles Adda, Benoît Sagot et al. 
Crowdsourcing for language resource development: criticisms about amazon mechanical turk overpowering use. In: Language and Technology Conference. Springer, 2011: 303–314. [11] Jiatao Gu, Zhengdong Lu, Hang Li et al. Incorporating Copying Mechanism in Sequence-to-Sequence Learning. arXiv:1603.06393 [cs], 2016-03. http://arxiv.org/abs/1603.06393, retrieved on 2018-01-15. [12] Ankush Gupta, Arvind Agarwal, Prawaan Singh et al. A Deep Generative Framework for Paraphrase Generation. arXiv:1709.05074 [cs], 2017-09. http://arxiv.org/abs/1709.05074, retrieved on 2018-01-09. [13] Florian Haupt, Dimka Karastoyanova, Frank Leymann et al. A model-driven approach for REST compliant services. In: Web Services (ICWS), 2014 IEEE International Conference on. IEEE, 2014: 129–136. [14] Kaiming He and Jian Sun. Convolutional neural networks at constrained time cost. In: Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 2015: 5353–5360. [15] Thorsten Joachims. A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization. [techreport], 1996. [16] Matt Kusner, Yu Sun, Nicholas Kolkin et al. From word embeddings to document distances. In: International Conference on Machine Learning, 2015: 957–966. [17] Quoc Le and Tomas Mikolov. Distributed representations of sentences and documents. In: International Conference on Machine Learning, 2014: 1188–1196. [18] Chang Liu, Daniel Dahlmeier and Hwee Tou Ng. PEM: A paraphrase evaluation metric exploiting parallel texts. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2010: 923–932. [19] Tomas Mikolov, Quoc V. Le and Ilya Sutskever. Exploiting similarities among languages for machine translation. arXiv preprint arXiv:1309.4168, 2013. [20] R. L. Newman, A. Clark, C. M. Trabant et al. Wilber 3: A Python-Django Web Application For Acquiring Large-scale Event-oriented Seismic Data. 
In: AGU Fall Meeting Abstracts, 2013. [21] Kishore Papineni, Salim Roukos, Todd Ward et al. BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting on association for computational linguistics. Association for Computational Linguistics, 2002: 311–318. [22] Ellie Pavlick and Chris Callison-Burch. Simple PPDB: A paraphrase database for simplification. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2016: 143–148. [23] Aaditya Prakash, Sadid A. Hasan, Kathy Lee et al. Neural Paraphrase Generation with Stacked Residual LSTM Networks. arXiv:1610.03098 [cs], 2016-10. http://arxiv.org/abs/1610.03098, retrieved on 2018-01-09. [24] Yossi Rubner, Carlo Tomasi and Leonidas J. Guibas. The earth mover’s distance as a metric for image retrieval. International journal of computer vision, 2000, 40(2): 99–121. [25] Abigail See, Peter J. Liu and Christopher D. Manning. Get To The Point: Summarization with Pointer-Generator Networks. arXiv:1704.04368 [cs], 2017-04. http://arxiv.org/abs/1704.04368, retrieved on 2018-01-09. [26] Luo Si and Jamie Callan. A statistical model for scientific readability. In: Proceedings of the tenth international conference on Information and knowledge management. ACM, 2001: 574–576. [27] Ilya Sutskever, Oriol Vinyals and Quoc V Le; ed. by Z. Ghahramani, M. Welling, C. Cortes et al. Sequence to Sequence Learning with Neural Networks. In: Advances in Neural Information Processing Systems 27. Curran Associates, Inc., 2014: 3104–3112. http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf, retrieved on 2018-01-09. [28] Nguyen Minh Tien and Cyril Labbé. Detecting Automatically Generated Sentences with Grammatical Structure Similarity. [29] Chris Van Pelt and Alex Sorokin. Designing a Scalable Crowdsourcing Platform. In: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. 
New York, NY, USA: ACM, 2012: 765–766. http://doi.acm.org/10.1145/2213836.2213951, retrieved on 2018-03-19. [30] Yaoyuan Zhang, Zhenxu Ye, Yansong Feng et al. A constrained sequence-to-sequence neural model for sentence simplification. arXiv preprint arXiv:1704.02312, 2017. [31] 谢文辉. 输出理论下改写训练在大学英语写作教学中的实证研究 [博士学位论文], 2013. [32] 吴锦, 张在新. 英语写作教学新探——论写前阶段的可行性. 外语教学与研究: 外国语文双月刊, 2000, 32(3): 213–218. [33] 颖孙. 输入理论视角下克服高中生英语写作中母语负迁移现象研究 [硕士学位论文], 2016. 检索于 2018-04-08. [34] 张燕. 高中英语写前活动的设计. 中小学外语教学, 2008, 1: 14–20. ﹀
馆藏号：	017/M2018(570)
公开日期：	2019-05-26

面向教育类视频的摘要生成技术研究与实现.帅远华

题名：	面向教育类视频的摘要生成技术研究与实现
作者：	帅远华
学号：	1501210664
语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师姓名：	刘耀
导师单位：	中国科学技术信息研究所
第二导师姓名：	高志军
第二导师单位：	软件与微电子学院
答辩日期：	2018-05-26
题目(外文)：	Research and Implementation on Video Summarization for Educational Videos
关键字(中文)：	视频摘要；教育类视频；视频结构化分析；视频语义分析；视频文本摘要
关键字(外文)：	Video Summarization; Educational Videos; Video Structural Analysis; Video Semantic Analysis; Video Text Summarization
文摘：	︿随着互联网和多媒体设备的快速发展，视频数据已渗透到我们生活的方方面面，其中也包括教育领域。近几年在线教育的火热引起了教育类视频的快速增长，为用户的查询和浏览带来新的需求。要从教育视频中提炼重要内容，让用户快速获取视频主要信息，便需要视频摘要技术。然而现有的针对教育类视频摘要的研究和应用较少，更缺乏对教育视频充分的结构化分析以及高层语义分析。在此背景下，本文分析前人研究成果，针对教育视频特点，研究了基于视频结构化分析和语义分析的教育视频文本摘要技术，从而生成信息覆盖面更全的视频摘要，帮助用户更高效浏览和获取教育视频主要内容。针对教育视频结构化分析不充分的问题，本文首先研究了教育类视频镜头分割以及关键帧提取方法。对于镜头分割，本文根据教育视频特点，针对HSV颜色直方图镜头分割方法进行了改进，通过视频分段预筛选缩短了镜头边界检测时长，并在最后加入pHash算法进一步提高镜头边界检测效果。针对教育视频关键帧提取，本文在镜头分割的基础上综合考虑视频帧的图像信息熵以及角点特征，选取更能够反映教育视频重要内容的视频帧作为关键帧。实验结果表明本文提出的方法能够有效缩短教育视频镜头检测时长，并且提取的视频关键帧冗余数和漏检数更低。在教育视频底层结构化分析后，本文进一步研究了教育视频语义分析相关技术，通过将关键帧角点特征和Hu不变矩特征作为联合向量训练SVM分类器，从而较好地区分了课件类以及非课件类关键帧。此外，本文提出了一种基于领域知识库的教育视频主题单元划分方法，通过引入领域知识库对镜头文本进行语义标注，从而挖掘教育视频潜在的子主题信息。最后通过实验证明了基于领域知识库的教育视频主题单元划分方法的有效性。本文在教育视频结构化分析以及语义分析的基础上，研究了教育视频文本摘要生成技术，通过划分后的视频主题单元对传统图模型的文摘方法进行了改进，一定程度上弥补了现有教育视频文本摘要研究的不足。本文使用领域语料训练Word2Vec模型，用于计算句子向量从而改进图模型中句子相似度的计算，并通过考虑句子是否包含教育视频核心术语以及线索词语对节点权重进行调整。实验使用ROUGE评价系统比较不同改进策略下生成的教育视频文本摘要，表明本文在主题单元基础上改进后的教育视频文本摘要方法能取得更为接近人工摘要的结果。﹀
文摘（外文）：	︿ With the rapid development of the Internet and multimedia devices, video data have penetrated into all aspects of our lives including educational field. In recent years, the prevalence of online education has caused the rapid growth of educational videos, which has brought new demands for users' browsing and querying. Video summarization techniques are needed in order to extract main content from educational videos and let users quickly obtain the important information. However, the existing research and applications for educational video summarization are few, and they lack the complete structural and high-level semantic analysis of educational videos. In this context, through analyzing the results of previous research and the characteristics of educational videos, this paper mainly studies video text summarization techniques based on video structural analysis and semantic analysis, so as to generate a more comprehensive video summary and help users to browse and access the main content of educational videos more quickly. To solve the problem of insufficient structural analysis of educational videos, this paper first studies the methods of video shot segmentation and key frame extraction. For the shot segmentation, we improve the shot segmentation method based on HSV color histogram according to the characteristics of educational videos. The pre-screening step of video clips is used to shorten the shot boundary detection duration. After the shot boundaries are detected, the pHash algorithm is used to further improve the results. For the extraction of video key frames, this paper considers the information entropy and the corner features of frames in shots, and selects video frames that can reflect the important content of educational videos as key frames. Experimental results show that the method proposed in this paper can effectively shorten the duration of educational video shot detection, and lower the number of redundant and missing key frames. 
After analyzing the underlying structure of educational videos, this paper further studies the related techniques of educational video semantic analysis. We train the SVM classifier by using the key frames’ corner feature and Hu invariant moment feature as joint vectors, thereby better distinguishing courseware and non-courseware key frames. In addition, this paper proposes a method to segment the topic units of educational videos based on domain knowledge base. By introducing the domain knowledge base to semantic annotating of shot texts, we try to mine the potential subtopic information of educational videos. And experimental results prove that the topic unit segmentation method of educational videos based on domain knowledge base is effective. Based on the structural and semantic analysis of educational videos, this paper studies the text summarization techniques of educational videos, and improves the traditional text summarization method of graph model through the guidance of video topic units, which to some extent makes up for the lack of research on the text summarization of educational videos. We use the domain corpus to train the Word2Vec model which is used to calculate the sentence vectors so as to improve the sentence similarity calculation. And then we consider whether the sentences contain core terms or clues of educational videos to adjust node weights. The experiments use ROUGE evaluation system to compare the video text abstract generated under different improvement strategies. The results show that the improved text summarization method based on topic units can get an abstract closer to the artificial one. ﹀
分类号：	TP3
论文总页数：	79
参考文献数：	84
参考文献：	︿ [1] Yadav K, Gandhi A, Biswas A, et al. ViZig: Anchor Points based Non-Linear Navigation and Summarization in Educational Videos[J]. 2016:407-418. [2] Rui Y, Huang T S, Mehrotra S. Exploring Video Structure Beyond The Shots[C]// IEEE International Conference on Multimedia Computing and Systems. IEEE Computer Society, 1998:237. [3] Petersohn C. Logical unit and scene detection: a comparative survey[J]. Proceedings of SPIE - The International Society for Optical Engineering, 2008:682002-682002-17. [4] Yuan J, Wang H, Xiao L, et al. A Formal Study of Shot Boundary Detection[J]. IEEE Transactions on Circuits & Systems for Video Technology, 2007, 17(2):168-186. [5] Zhang Z P, Liu K, Peng J H. Shot Boundary Detection Based on Histogram of Mismatching-Pixel Count of FMB[C]// Industrial Electronics and Applications, 2006, IEEE Conference on. IEEE, 2006:1-5. [6] 俞璐, 乔瑞萍, 胡宇平,等. 基于颜色直方图的快速镜头分割方法[J]. 微电子学与计算机, 2014(2):72-76. [7] Fanan N G, Khobragade A S. A fast robust technique for video shot boundary detection[C]// Online International Conference on Green Engineering and Technologies. IEEE, 2017:1-6. [8] Zongjie Li, Xiabi Liu, Shuwen Zhang. Shot Boundary Detection based on Multilevel Difference of Colour Histograms[C]// International Conference on Multimedia & Image Processing. IEEE, 2016:15-22. [9] 陈锦. 基于特征学习的视频摘要算法研究[D]. 北京:北京大学, 2017. [10] Tong W, Song L, Yang X, et al. CNN-based shot boundary detection and video annotation[C]// IEEE International Symposium on Broadband Multimedia Systems and Broadcasting. IEEE, 2015:1-5. [11] Ejaz N, Tariq T B, Baik S W. Adaptive key frame extraction for video summarization using an aggregation mechanism. J Vis Commun Image Represent[J]. Journal of Visual Communication & Image Representation, 2012, 23(7):1031-1040. [12] Avila S E F D. VSUMM: A mechanism designed to produce static video summaries and a novel evaluation method[J]. Pattern Recognition Letters, 2011, 32(1):56-68. [13] Sheena C V, Narayanan N K. 
Key-frame Extraction by Analysis of Histograms of Video Frames Using Statistical Methods[J]. Procedia Computer Science, 2015, 70: 36-40. [14] 丁洪丽, 陈怀新. 基于镜头内容变化率的关键帧提取算法[J]. 计算机工程, 2009, 35(13):225-227. [15] Algur S P. Video Key Frame Extraction using Entropy value as Global and Local Feature[J]. arXiv preprint arXiv:1605.08857, 2016. [16] Lai J L, Yi Y. Key frame extraction based on visual attention model[J]. Journal of Visual Communication & Image Representation, 2012, 23(1):114-125. [17] Drew M S, Au J. Video keyframe production by efficient clustering of compressed chromaticity signatures (poster session)[C]// Eighth ACM International Conference on Multimedia. ACM, 2000:365-367. [18] Srinivas M, Pai M M M, Pai R M. An Improved Algorithm for Video Summarization – A Rank Based Approach[J]. Procedia Computer Science, 2016, 89:812-819. [19] Liu H, Li T. Key frame extraction based on improved frame blocks features and second extraction[C]// International Conference on Fuzzy Systems and Knowledge Discovery. IEEE, 2015:1950-1955. [20] Wolf W. Key frame selection by motion analysis[C]//Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on. IEEE, 1996, 2: 1228-1231. [21] 朱松豪. 视频摘要技术的研究[D]. 上海:上海交通大学, 2009. [22] Zhu S, Liu Y. Video scene segmentation and semantic representation using a novel scheme[J]. Multimedia Tools & Applications, 2009, 42(2):183-205. [23] 刘嘉琦, 封化民, 闫建鹏. 基于多模态特征融合的新闻故事单元分割[J]. 计算机工程, 2012, 38(24):161-165. [24] Xu S, Feng B, Xu B. Multi-modal topic unit segmentation in videos using conditional random fields[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2013:2287-2291. [25] Sidiropoulos P, Mezaris V, Kompatsiaris I, et al. Multi-modal scene segmentation using scene transition graphs[C]// International Conference on Multimedia 2009, Vancouver, British Columbia, Canada, October. DBLP, 2009:665-668. [26] Baraldi L, Grana C, Cucchiara R. 
Scene segmentation using temporal clustering for accessing and re-using broadcast video[C]// IEEE International Conference on Multimedia and Expo. IEEE, 2015:1-6. [27] Chen B W, Wang J C, Wang J F. A Novel Video Summarization Based on Mining the Story-Structure and Semantic Relations Among Concept Entities[J]. IEEE Transactions on Multimedia, 2009, 11(2):295-312. [28] 王廉. 多模态教学视频语义分析及实现[D]. 江苏：南京理工大学, 2014. [29] Imran A S, Cheikh F A. Blackboard content classification for lecture videos[C]// IEEE International Conference on Image Processing. IEEE, 2011:2989-2992. [30] Fan Q, Amir A, Barnard K, et al. Temporal modeling of slide change in presentation videos[J]. 2015, 1:989-992. [31] Bai L, Lao S, Guo J. Video semantic concept detection using ontology[C]// Icimcs 2011, the Third International Conference on Internet Multimedia Computing and Service, Chengdu, China, August. DBLP, 2011:158-163. [32] Ouyang J Q, Liu R. Ontology reasoning scheme for constructing meaningful sports video summarisation[J]. Iet Image Processing, 2013, 7(4):324-334. [33] Ballan L, Bertini M, Bimbo A D, et al. Video Annotation and Retrieval Using Ontologies and Rule Learning[J]. IEEE Multimedia, 2010, 17(4):80-88. [34] Nandhini R P R, Valarmathie P. Crucial video content extraction using ontology rule-based technology and decision making algorithm[C]// International Conference on Computer Communication and Systems. IEEE, 2015:081-085. [35] 李佳桐. 自适应视频摘要算法研究[D]. 安徽：中国科学技术大学, 2017. [36] Guan G, Wang Z, Lu S, et al. Keypoint-Based Keyframe Selection[J]. IEEE Transactions on Circuits & Systems for Video Technology, 2013, 23(4):729-734. [37] Ranjan R K, Agrawal A. Video Summary Based on F-Sift, Tamura Textural and Middle Level Semantic Feature [J]. Procedia Computer Science, 2016, 89:870-876. [38] Fei M, Jiang W, Mao W. Memorable and rich video summarization [J]. Journal of Visual Communication & Image Representation, 2017, 42:207-217. [39] 赵树娟. 基于多模态融合的讲座类视频摘要提取的方法设计与研究[D]. 北京:北京大学, 2014. 
[40] Chang W H, Yang J C, Wu Y C. A Keyword-based Video Summarization Learning Platform with Multimodal Surrogates[C]// IEEE, International Conference on Advanced Learning Technologies. IEEE Computer Society, 2011:37-41. [41] 赵洋洋, 徐常胜, 梁超. 基于文本的自动视频摘要[C]//第七届和谐人机环境联合学术会议 (HHME2011) 论文集 [oral]. 2011. [42] Evangelopoulos G, Zlatintsi A, Potamianos A, et al. Multimodal Saliency and Fusion for Movie Summarization Based on Aural, Visual, and Textual Attention[J]. IEEE Transactions on Multimedia, 2013, 15(7):1553-1568. [43] Sah S, Kulhare S, Gray A, et al. Semantic Text Summarization of Long Videos[C]// Applications of Computer Vision. IEEE, 2017:989-997. [44] Choi J, Oh T H, Kweon I S. Textually Customized Video Summaries[J]. arXiv preprint arXiv:1702.01528, 2017. [45] Molino A G D, Boix X, Lim J H, et al. Active Video Summarization: Customized Summaries via On-line Interaction with the User[C]// AAAI Conference on Artificial Intelligence. 2017. [46] Panda R, Roy-Chowdhury A K. Sparse modeling for topic-oriented video summarization[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2017:1388-1392. [47] Li G, Ma S, Han Y. Summarization-based Video Caption via Deep Neural Networks[C]// ACM International Conference on Multimedia. ACM, 2015:1191-1194. [48] Christian H, Agus M P, Suhartono D. Single Document Automatic Text Summarization using Term Frequency-Inverse Document Frequency (TF-IDF)[J]. ComTech: Computer, Mathematics and Engineering Applications, 2016, 7(4): 285-294. [49] Edmundson H P, Wyllys R E. Automatic abstracting and indexing—survey and recommendations[J]. Communications of the Acm, 1961, 4(5):226-234. [50] 傅间莲, 陈群秀. 自动文摘系统中的主题划分问题研究[J]. 中文信息学报, 2005, 19(6):28-35. [51] Mihalcea R. TextRank : Bringing order into texts[J]. EMNLP, 2004, 2004:404-411. [52] Erkan, Radev, Dragomir R. LexRank: graph-based lexical centrality as salience in text summarization[J]. Journal of Qiqihar Junior Teachers College, 2004, 22:2004. 
[53] Garg N, Favre B, Reidhammer K, et al. Clusterrank: a graph based method for meeting summarization[C]//Tenth Annual Conference of the International Speech Communication Association. 2009. [54] Yu S, Su J, Li P, et al. Towards high performance text mining: a TextRank-based method for automatic text summarization[J]. International Journal of Grid and High Performance Computing (IJGHPC), 2016, 8(2): 58-75. [55] Patricia Nunes Gonçalves, Rino L, Vieira R. Summarizing and referring: towards cohesive extracts[C]// ACM Symposium on Document Engineering, Sao Paulo, Brazil, September. DBLP, 2008:253-256. [56] Cunha I D, Fernández S, Morales P V, et al. A New Hybrid Summarizer Based on Vector Space Model, Statistical Physics and Linguistics[M]// MICAI 2007: Advances in Artificial Intelligence. Springer Berlin Heidelberg, 2007:872-882. [57] Paulus R, Xiong C, Socher R. A deep reinforced model for abstractive summarization[J]. arXiv preprint arXiv:1705.04304, 2017. [58] Rush A M, Chopra S, Weston J. A Neural Attention Model for Abstractive Sentence Summarization[J]. Computer Science, 2015. [59] Verma S, Nidhi V. Extractive Summarization using Deep Learning[J]. arXiv preprint arXiv:1708.04439, 2017. [60] Shao H, Qu Y, Cui W. Shot boundary detection algorithm based on HSV histogram and HOG feature[C]//5th International Conference on Advanced Engineering Materials and Technology. 2015: 951-957. [61] Tripathi G. Review on color and texture feature extraction techniques[J]. International Journal of Enhanced Research in Management and Computer Applications, 2014, 3(5): 77-81. [62] Harris C. A combined corner and edge detector[J]. Proc Alvey Vision Conf, 1988, 1988(3):147-151. [63] Smith S M, Brady J M. SUSAN—A New Approach to Low Level Image Processing[J]. International Journal of Computer Vision, 1997, 23(1):45-78. [64] Lowe D G. Object Recognition from Local Scale-Invariant Features[C]// iccv. IEEE Computer Society, 1999:1150. [65] Bay H, Tuytelaars T, Van Gool L. 
Surf: Speeded up robust features[J]. Computer vision–ECCV 2006, 2006: 404-417. [66] Li Y N, Lu Z M, Niu X M. Fast video shot boundary detection framework employing pre-processing techniques[J]. IET image processing, 2009, 3(3): 121-134. [67] Zhang H J, Kankanhalli A, Smoliar S W. Automatic partitioning of full-motion video[J]. Multimedia Systems, 1993, 1(1):10-28. [68] Shannon C E, Weaver W. The mathematical theory of communication[J]. M.d.computing Computers in Medical Practice, 1950, 3(9):31-32. [69] Ren L, Qu Z, Niu W, et al. Key frame extraction based on information entropy and edge matching rate[C]// International Conference on Future Computer and Communication. IEEE, 2010:V3-91-V3-94. [70] Angadi S, Naik V. Entropy Based Fuzzy C Means Clustering and Key Frame Extraction for Sports Video Summarization[C]// Fifth International Conference on Signal and Image Processing. IEEE Computer Society, 2014:271-279. [71] Rosten E, Drummond T. Machine learning for high-speed corner detection[J]. Computer Vision–ECCV 2006, 2006: 430-443. [72] Muhammad K, Sajjad M, Mi Y L, et al. Efficient visual attention driven framework for key frames extraction from hysteroscopy videos[J]. Biomedical Signal Processing & Control, 2017, 33:161-168. [73] 瞿中, 高腾飞, 张庆庆. 一种改进的视频关键帧提取算法研究[J]. 计算机科学, 2012, 39(8):300-303. [74] Hu M. Visual pattern recognition by moment invariants[J]. Information Theory Ire Transactions on, 1962, 8(2):179-187. [75] 郑德举. 基于WEB的语义元数据辅助构建平台关键技术研究与实现[D]. 北京:北京大学, 2013. [76] Studer R, Benjamins V R, Fensel D. Knowledge engineering: principles and methods[J]. Data & Knowledge Engineering, 2010, 25(1–2):161-197. [77] Hearst M A. TextTiling: segmenting text into multi-paragraph subtopic passages[M]. MIT Press, 1997. [78] Tibbo H R. The art of abstracting: 2nd ed. E. T. CRIMMINS. Information Resources Press, Arlington, Vir. (1996). xvii + 230 pp. ISBN 0-87815-066-8, $34.95[J]. Information Processing & Management, 1997, 33(4):573. [79] Endres-Niggemeyer B, Neugebauer E. 
Professional summarizing: no cognitive simulation without observation[J]. Journal of the American Society for Information Science, 1998, 49(6):486–506. [80] Page L. The PageRank citation ranking : Bringing order to the web[J]. Stanford Digital Libraries Working Paper, 1998, 9(1):1-14. [81] Mikolov T, Chen K, Corrado G, et al. Efficient Estimation of Word Representations in Vector Space[J]. Computer Science, 2013. [82] Le Q V, Mikolov T. Distributed Representations of Sentences and Documents[J]. 2014, 4:II-1188. [83] Arora S, Liang Y, Ma T. A simple but tough-to-beat baseline for sentence embeddings[J]. 2016. [84] Lin C Y, Hovy E. Automatic evaluation of summaries using N-gram co-occurrence statistics[C]// Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology. Association for Computational Linguistics, 2003:71-78. ﹀
馆藏号：	017/M2018(432)
公开日期：	2021-05-26

面向专业领域的自动综述关键技术研究.涂梦

题名：	面向专业领域的自动综述关键技术研究
作者：	涂梦
学号：	1501210690
语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师姓名：	刘耀
导师单位：	中国科学技术信息研究所
第二导师姓名：	高志军
第二导师单位：	软件与微电子学院
答辩日期：	2018-05-26
题目(外文)：	Research and Implementation on Automatic Review in Specific Domain
关键字(中文)：	自动综述；文本解析；语义标注；文献检索
关键字(外文)：	Automatic Review; Text Analysis; Semantic Annotation; Document Retrieval
文摘：	文本自动综述是自动摘要的典型应用，同时也是多文档自动摘要进一步发展的结果。它便于用户方便快捷地阅读和了解多个文本的主要内容，具有效率高、覆盖面广、速度快等人工撰写综述所没有的特性。针对特定领域的自动综述问题，如何将人类的先验知识与文本自动综述相结合具有重要研究价值。近年来，知识图谱研究热度的上升使得本体等结构化语义知识库重新焕发生机。因此，本文以护理领域学术文献为研究对象，以实现特定领域文献综述的自动生成为目标，以结构化语义知识为主要研究工具，在分析了综述的文本特征并构建自动综述模型的基础上，重点研究了高质量文献的自动获取、文本内容自动解析、综述文本自动生成三大关键技术。首先，研究了基于语义的文献自动检索技术。通过引入背景语义知识，进而建立了文献语义检索模型。通过为文献信息资源提供语义标注信息，使系统对领域内的概念、概念之间的关系具备统一的认识，从而显著地提高结果的语义相关性。同时，在实现语义相关性的基础上引入权威度评价，通过构建合理的文献评价指标体系，对检索结果进行综合排序，筛选出最有价值的参考文献。其次，研究了基于语义的文本内容自动解析技术。通过领域本体的构建，获取背景语义知识。利用三元组语义知识对文献资源进行语义标注，完成句子颗粒度的文本内容解析，识别出文献中每个句子的主题特征。然后，基于文献自动检索和文本自动解析的结果，提出了一种基于语义的自动综述生成方法。结合文献资源的形式语义（结构）和内容语义（主题）实现综述子主题的自动演化，综合考虑子主题的属性和文献覆盖度进行排序，并对应抽取得分最高的句子作为中心句，最终根据综述结构模型以及主题演化结果，实现综述文本的自动生成。最后，本文通过实验论证了上述方法的有效性，设计并实现了面向护理领域的文献自动综述系统，该系统支持语料及知识爬取、文献语义检索、综述自动生成、人机交互修改等功能，满足了针对特定领域的学术文献自动综述的需求。
文摘（外文）：	Automatic text review is a typical application of automatic summarization and a natural outgrowth of multi-document summarization. It lets users read and grasp the main content of multiple texts quickly, and it offers efficiency, broad coverage, and speed that manually written reviews lack. For automatic review in a specific domain, combining human prior knowledge with automatic text review has significant research value. In recent years, the rising popularity of knowledge graphs has revived structured semantic knowledge bases such as ontologies. This thesis therefore takes academic literature in the nursing field as its research object, aims to generate reviews of domain literature automatically, and uses structured semantic knowledge as its main research tool. After analyzing the textual features of reviews and building an automatic review model, it focuses on three key technologies: automatic retrieval of high-quality documents, automatic parsing of text content, and automatic generation of review text. First, semantic-based automatic document retrieval is studied. Background semantic knowledge is introduced to build a semantic retrieval model for literature. Semantic annotation of document resources gives the system a unified understanding of the domain's concepts and the relationships between them, which significantly improves the semantic relevance of the results. On top of semantic relevance, an authority evaluation is introduced: a literature evaluation index system is constructed, the search results are ranked comprehensively, and the most valuable references are selected. Second, semantic-based automatic parsing of text content is studied. Background semantic knowledge is acquired by constructing a domain ontology. Literature resources are semantically annotated with triple-based semantic knowledge, which completes sentence-level content parsing and identifies the topic of each sentence. Then, based on the results of retrieval and parsing, a semantic-based method for automatic review generation is proposed. It combines the formal semantics (structure) and content semantics (topics) of literature resources to derive review subtopics automatically, ranks the subtopics by their attributes and document coverage, extracts the highest-scoring sentence for each as its central sentence, and finally generates the review text according to the review structure model and the topic derivation results. Finally, experiments demonstrate the effectiveness of these methods, and an automatic literature review system for the nursing field is designed and implemented. The system supports corpus and knowledge crawling, semantic document retrieval, automatic review generation, and human-computer interactive revision, meeting the need for automatic review of academic literature in a specific domain.
分类号：	TP3
论文总页数：	77
参考文献数：	58
馆藏号：	017/M2018(616)
公开日期：	2021-05-26

2018-05-25

面向显隐式语法教学的学习材料加工和教学优化研究.林凤怡

题名：	面向显隐式语法教学的学习材料加工和教学优化研究
姓名：	林凤怡
学号：	1501210604
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2018-05-25
关键词：	二语语法习得；自主学习；英语分级阅读；显隐式教学
论文摘要：	作为语言的三大要素之一，语法的重要地位在学界已广受认可，但目前的语法教学仍存在较大改善空间。语法或被完全轻视或重理论轻应用；学习材料普遍无法为学习者提供有吸引力的背景材料和可促进学习者自主学习的学习环境；整体学习进程几乎完全在教师的一致带领下进行，学习者自主性过低。针对以上问题，本研究以高中英语语法教学作为切入点，基于二语习得的相关理论以及前人在语法学习材料和教学方式领域的已有探讨，运用斯坦福句法剖析器、特征抽取系统、篇章排序系统和语法识别系统等工具，以分级阅读要求为标准，从学习材料和教学流程两个方面入手，优化现有语法教学。学习材料方面，研究以原生在线语料为选材来源，筛选并归类出不同题材、难度适宜的篇章。同时，研究对材料中的目标语法进行标注，并辅以源自原生语料、题材相同（近）、难度适宜、包含目标语法的小语境式类比例句，从而保证目标语法的出现密度，引导学习者主动注意目标语法，提高学习者的接受度，促进学习者自主学习。此外，研究以国外经典教材为原材料，精挑出内容契合、形式多样的实践练习，促使学习者积极运用所学语法，提高语言实践能力。教学流程方面，研究将学习者定义为课堂中心，将教师定义为引路人，希望通过双方合力，充分挖掘学习者的主观能动性，将语法学习变为高效、自主、有趣的学习过程。研究在安徽省合肥市某高级中学进行了为期6周的教学实验，以60名高二年级学生为实验对象，分为实验组和对照组，分别使用研究编纂学习材料和传统材料学习5个相同的目标语法，最后用同一测试材料检验学习效果。此外，研究还通过调查问卷和当面访谈的形式了解学习者和教师对两种学习模式的主观感受。结果表明：本研究编纂的语法学习材料可以帮助学习者取得更好的学习效果，可以满足学习者的不同爱好、贴合学习者的阅读水平、提高学习者的主观能动性、深化学习者的应用能力，促进学习者愉悦高效地掌握目标语法，有助于学习者突破“学习”，迈向“习得”。
外文摘要：	Grammar is one of the three essential elements of language, and its importance is widely recognized, yet present grammar teaching still has clear limitations. Grammar is either largely neglected or taught with an emphasis on theory over use. Existing materials do little to engage learners, instruction centers on the teacher, and students show little initiative or enthusiasm for learning. Focusing on high school grammar learning and using automatic tools such as the Stanford Parser, FeatureMagic, DiscourseRankMagic, and a grammar test system, the study compiles a set of grammar learning materials and designs a supplementary teaching route that gives full play to students’ autonomous learning, in line with the demands of the curriculum standard and the test syllabus. Theoretical and empirical findings on second language acquisition and on learner initiative also guide the study. The learning materials are selected from online native newspaper archives, filtered according to the demands of graded English reading, and sorted into a variety of attractive themes. The target grammar items in them are marked in different styles so that students notice them more actively. A series of sample sentences on the same topic is attached to support students’ exploration and summarizing. As for exercises, communicative exercises, which develop pragmatic awareness and fluency more efficiently, are added to the traditional ones. A teaching route is also designed to supplement the materials, with the aim of fully engaging students’ initiative and making grammar learning more efficient, more active, and more pleasant. The study conducts a 6-week empirical study with 60 high school students in Hefei, Anhui. Participants are divided into treatment and control groups, which use the compiled learning materials with the new teaching method and common materials with the conventional method, respectively. Through tests and interviews, the study seeks to verify the effectiveness and acceptance of the proposed learning materials and teaching methods. The results show that feedback, in both objective outcomes and subjective attitudes, is more positive in the treatment group. Participants in the treatment group report that they are more inclined to find and explore questions by themselves, because the learning process stimulates their enthusiasm and lets them form a virtuous cycle between acquiring and consolidating knowledge, which in turn helps them truly acquire the grammar items.
分类号：	H314/G633
论文总页数：	88
参考文献总数：	80
馆藏号：	017/M2018(213)
公开日期：	2018-05-25

基于支架式理论的技术文档写作教学研究.闫晓宁

题名：	基于支架式理论的技术文档写作教学研究
姓名：	闫晓宁
学号：	1601210816
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2018-05-25
外文题名：	Research on Technical Writing Teaching Based on Scaffolding Theory
关键词：	支架式理论；技术文档写作；技术文档写作教学
外文关键词：	Scaffolding theory; Technical writing; Technical writing teaching
论文摘要：	在过去十几年里，我国在技术文档写作教学方面取得了显著成就，但仍存有许多问题亟待解决：技术文档写作教学方法缺乏实证研究，学生的主体地位未得到充分发挥，教学材料相对固定，评估机制单一等。这些因素导致学生虽掌握了技术文档写作的相关知识，但在开发文档时却依旧困难重重、无从下手。本研究以技术文档写作教学为切入点，基于建构主义理论、最近发展区和社会文化理论的研究成果，结合技术文档写作的独有特点及当前教学存在的问题，提出了基于支架式理论的技术文档写作教学方法。该方法将学习的主导权由教师转向学生，通过有针对性地搭建支架引导学生自主学习，充分发挥学生学习的主体性；摒弃传统的固态化教学，根据课堂反馈、课后练习及测试确定学生的最近发展区，动态地调整学习材料，确保学习难度维持在i+1的水平；综合同伴互评、教师评价及个人自评的多元化评估机制，提高学生的自我认知能力和反思能力。为检验该方法的有效性，本研究在北京大学开展了为期10周的技术文档写作教学实验，以42名计算机辅助翻译方向一年级学生为实验对象。实验组采用本研究提出的基于支架式理论的技术文档写作教学方法，对照组采用传统的技术文档写作教学方法，并在实验结束后对调查问卷、访谈及实验数据展开了评估与分析。研究结果表明，本研究提出的基于支架式理论的技术文档写作教学方法可以有效提高学生的技术文档写作能力。该方法有助于学生培养写前用户分析及产品分析的习惯；有助于学生熟练运用技术文档写作质量规范及写作风格；有助于学生掌握技术写作的文档元素和篇章结构。其中，该方法对技术文档写作简洁性、写作风格、视觉效果及组织架构方面的提升效果显著。同时，调查问卷显示学生对该教学方法持肯定态度。
外文摘要：	China has made remarkable achievements in technical writing teaching in the past ten years, but many problems still need to be solved urgently: teaching methods lack empirical research, students’ central role is not fully realized, teaching materials are relatively fixed, and the evaluation mechanism is one-dimensional. As a result, although students master the rules of technical writing, they still struggle when developing documents. Building on research on constructivism, the zone of proximal development (ZPD), and sociocultural theory, and taking into account the distinctive characteristics of technical writing and the current problems in its teaching, this study proposes a technical writing teaching method based on scaffolding theory. The method shifts control of learning from the teacher to the student and builds targeted scaffolds that guide students to learn independently, giving full play to their role as the subject of learning. Unlike traditional fixed teaching, it adjusts the learning materials dynamically according to students’ ZPD, which is determined from classroom feedback, after-class exercises, and tests, to keep the difficulty at a suitable level. A diversified assessment mechanism that combines peer review, teacher evaluation, and self-assessment improves students’ self-awareness and capacity for reflection. To test the effectiveness of the method, a ten-week teaching experiment was conducted at Peking University with 42 students majoring in computer-aided translation as participants. The experimental group adopted the teaching method proposed in this thesis, while the control group adopted the traditional method of teaching technical writing. At the end of the experiment, the questionnaires, interviews, and experimental data were evaluated and analyzed. The results show that the proposed teaching method based on scaffolding theory can effectively improve students’ technical writing ability. It helps students form the habit of analyzing users and products before writing, apply technical writing style proficiently, and grasp the components and structure of technical documents; the improvement is especially marked in conciseness, writing style, visual effectiveness, and organizational structure. In addition, the questionnaire shows that students view the method positively.
分类号：	H08
论文总页数：	100
参考文献总数：	69
Call number:	017/M2018(321)
Release date:	2018-05-25

中式英语的自动检测研究与应用.于婵

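Title:	中式英语的自动检测研究与应用
Author:	于婵
Student ID:	1501210769
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Availability:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	俞敬松
Advisor 1 affiliation:	School of Software and Microelectronics
Defense date:	2018-05-25
English title:	Research and Application of Automatic Detection of Chinglish
Keywords:	Chinglish; error identification; collocation errors; feedback
English keywords:	Chinglish; error detecting; collocation error; feedback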
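Abstract:	Chinglish is widespread in the English writing of Chinese students today, and it undermines the accuracy and idiomaticity of their written expression. Telling Chinese learners of English exactly where Chinglish occurs, what type it is, and how to correct it can raise their sensitivity to Chinglish and thereby reduce errors. This thesis uses a rule-based approach to identify and detect Chinglish in students' English writing and to give feedback, with the ultimate aim of improving the accuracy and idiomaticity of Chinese learners' English writing and the effectiveness of English teaching. Existing research on Chinglish concentrates mostly on causes, case analyses, and teaching suggestions; few studies approach it from the perspective of automatic discovery and recognition with natural language processing techniques. In this thesis, the author explores effective recognition and detection of Chinglish in three stages. In the first stage, Chinglish cases from books and papers are classified, summarized, and analyzed from the perspective of automatic discovery and recognition, and the classified cases are written as rules in XML for LanguageTool so that they can be detected; the existing rule base is also evaluated and improved through systematic testing, which effectively raises its recognition accuracy on Chinglish sentences and lowers its false alarm rate on correct sentences. In the second stage, the "inverted index" of the Lucene toolkit and the Stanford CoreNLP natural language processing tools are used to make full and efficient use of the results of earlier research on error-prone collocations; the author analyzes the shortcomings of that earlier research and the problems encountered in using it, improves on them, and applies the results to detecting collocation-type Chinglish in writing. In the third stage, to obtain more Chinglish rule patterns, the author proposes and implements a method for deriving Chinglish recognition rules from existing large-scale error-annotated learner corpora, the CLEC corpus and the NUCLE corpus: rules are generated automatically from the annotation span, constrained by syntactic analysis. Based on the exploration in the third stage, the author finally develops a tool that combines semi-automatic Chinglish rule generation with automatic detection of Chinglish in students' English compositions; it maintains relatively high recognition accuracy and a relatively low false alarm rate, and greatly reduces the labor and time needed to write rules by hand. With this tool, rules can be generated automatically, revised manually, and validated against correct and erroneous corpora, producing a steadily growing rule base; the rules in it that have been confirmed as sound and effective can then be used to detect Chinglish in Chinese students' English writing. The outcomes of this research are: a set of Chinglish rules with relatively high recognition accuracy and a relatively low false alarm rate, obtained in a scientific and effective way; efficient use of the existing table of error-prone Chinglish collocation rules; a large number of collocation-type Chinglish rule patterns obtained from existing learner corpora; and a systematic workflow, designed and proposed here, whose sub-functions are integrated and implemented on the Eclipse platform. This work makes it easier for rule editors to write Chinglish recognition rules, helps Chinese learners of English improve their writing, and substantially reduces the workload of English teachers in correcting recurring Chinglish errors in compositions.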
English abstract:	Chinglish is common in the writing of Chinese students today, and it affects the accuracy and idiomaticity of their written expression. Clearly pointing out the type of Chinglish and providing effective corrective feedback to Chinese learners of English can improve students' sensitivity to Chinglish and reduce errors. The purpose of this paper is to identify and detect Chinglish in students' English writing through a rule-based approach and to give them feedback. The ultimate goal is to improve the accuracy and idiomaticity of Chinese learners' English writing, as well as the effectiveness of English teaching. Many studies on Chinglish focus on causes, case analyses, and teaching suggestions, but few are conducted from the perspective of automatic discovery and recognition through natural language processing techniques. In this paper, the author explores the effective recognition and detection of Chinglish in three stages. The first stage is to classify, summarize, and analyze Chinglish cases in books and papers from the perspective of automatic discovery and recognition. LanguageTool is used to express the cases as rules written in XML. The author tests, evaluates, and improves the existing rule base via systematic testing methods, effectively improving the accuracy of the rule base. The second stage is to make full use of the results of previous research on error-prone collocations through the "inverted index" approach of the Lucene toolkit and the Stanford CoreNLP natural language processing tool. The author also analyzes the deficiencies of the previous studies and the problems encountered in using them. After some corrections, the author applies the results to the detection of collocation-type Chinglish in writing. In the third stage, in order to obtain more Chinglish rules and patterns, the author proposes a method for acquiring Chinglish recognition rules from large-scale annotated learner corpora, the CLEC corpus and the NUCLE corpus: the rules can be generated automatically from the scope of the annotations and syntactic analysis. Based on the third stage of research, the author finally develops a Chinglish detection system. The system, which combines semi-automatic Chinglish rule generation with automatic detection of Chinglish in students' English writing, can ensure high recognition accuracy and a low false alarm rate, and it greatly saves the labor and time of writing rules by hand. The tool helps generate rules automatically and modify them manually. Users can use the BNC corpus and the CLEC corpus to verify the validity of rules, build a constantly growing rule base, and use the recognition rules that have been confirmed valid to detect Chinglish in English writing. The achievement of this study is a batch of Chinglish rules with high recognition accuracy and a low false alarm rate, obtained in a scientific and effective way. The author has made full use of the existing table of error-prone Chinglish collocation rules and has acquired a large number of rule patterns of collocational Chinglish from the annotated learner corpora. A systematic process is designed and proposed, and the integration and implementation of the sub-functions are performed on the Eclipse platform. The research can promote Chinese English learners' writing skills and effectively reduce English teachers' workload of correcting repetitive Chinglish errors in English writing.
Classification number:	H08
Total pages:	70
Total references:	41
Call number:	017/M2018(326)
Release date:	2018-05-25

Keystroke logging 评估的技术写作和术语教学研究.钟梦俐

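Title:	Keystroke logging 评估的技术写作和术语教学研究
Author:	钟梦俐
Student ID:	1501210817
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Availability:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	俞敬松
Advisor 1 affiliation:	School of Software and Microelectronics
Defense date:	2018-05-25
English title:	Research on Evaluating Technical Writing and Terminology Teaching using Keystroke Logging
Keywords:	technical writing; terminology teaching; keystroke logging
English keywords:	Technical writing; Terminology teaching; Keystroke logging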
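Abstract:	In recent years technical writing courses have drawn discussion in academia, but empirical research on how to teach them remains relatively scarce. Technical writing teaching in China currently has the following problems: terminology, which matters greatly in technical writing, is neglected as teaching content; students' terminology problems are not investigated, and teaching is disconnected from practice; students grasp terms imprecisely, cannot understand the underlying concepts in depth, and struggle to master the writing process and its techniques. This study takes terminology problems as its entry point. Building on research on writing cognition, and in order to examine terminology learning from a cognitive perspective, it uses keystroke logging, a tool from cognitive psychology research, to innovate terminology teaching methods in technical writing. Keystroke logging is a commonly used research tool in the cognitive field: it faithfully records writing-process data and related information for researchers to analyze, and some researchers also use it to assess writing outcomes. The study first runs a keystroke logging experiment on terminology cognition to probe the obstacles students face in producing terms, and concludes the following: students have an unclear grasp of the conceptual content and logical principles behind terms; their sensitivity to terms and ability to discriminate between them are weak; and they apply terms to specific contexts inefficiently when writing. On this basis the study proposes a situational method for teaching terminology in technical writing. The teaching design has three main parts: product and term analysis, domain knowledge learning and term practice, and assessment of cognitive outcomes. Product and term analysis embeds terms in a product or an operating procedure and dissects the terms through that product or procedure. Domain knowledge learning uses related texts and keywords to trace terms back to their sources. Together, these two parts reinforce the meaning and pragmatic use of terms at the knowledge-transfer level of writing cognition. Term practice builds command of term usage through collocations. Assessment of cognitive outcomes analyzes the writing produced under the teaching method through purpose-designed keystroke logging indicators, which helps in observing and studying students' English technical writing and gives deeper feedback on teaching activities. A 4-week teaching experiment with an experimental group and a control group was carried out to verify the method's effectiveness. The experimental group learned with the situational terminology teaching method; the control group learned directly from a term reference table; both groups then took a writing test under the keystroke logging tool. A comparative analysis of changes in the cognition-related keystroke logging indicators shows that the proposed terminology teaching method helps students analyze and understand terms and significantly improves the conciseness of technical documents. Keystroke logging also has a positive effect on the teaching of technical document writing, and this work broadens its further study and application in the field of writing cognition.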
English abstract:	In recent years, technical writing courses have been discussed in academic circles; however, the number of empirical studies on their teaching methods is relatively low. The problems in current domestic technical writing teaching are as follows: the teaching of terminology, which is important to technical writing, is ignored; students' terminology problems are not investigated, and the teaching process ignores the connection to practice; students are not sensitive to terms, are unable to understand the concepts, and find it difficult to master the writing process and skills. Based on research results on writing cognition, this study takes the terminology problem as a starting point, explores the term-learning problem from a cognitive perspective, and uses keystroke logging, a tool from the field of cognitive psychology, to innovate the terminology teaching method. Keystroke logging is a commonly used research tool in the cognitive domain; it can document writing-process data and related information for researchers' analysis, and it can also be used to assess writing results. Through a keystroke logging experiment on terminology cognition, this study first explored the barriers to students' terminology output and then identified the following reasons: students could not grasp the connotation and logical principles of term concepts; their sensitivity to terms and ability to differentiate them are weak; and their use of terms in specific contexts while writing is ineffective. On the basis of this analysis, we put forward a situational terminology teaching method for technical writing. The teaching method mainly includes three parts: product and term analysis, domain knowledge learning and term practice, and cognitive result evaluation. Product and term analysis is mainly about integrating terms into a product or operation and analyzing the terminology through that product or process. Domain knowledge, through related chapters and keywords, is used to trace the source of terms. Both of these processes reinforce the connotation and pragmatic use of terms at the knowledge-transfer level of writing cognition. Term practice teaches term collocation, and the cognitive result assessment analyzes the writing results of teaching by designing keystroke logging indicators, which helps to better observe students' English technical writing and gives more in-depth feedback on teaching activities. Through a 4-week teaching experiment, the study sets up an experimental group and a control group to verify the effectiveness of the teaching method. The former adopts the situational terminology teaching method; the latter uses a direct term reference table; the two groups then take writing tests under the keystroke logging tool. Through comparative analysis of changes in the cognition-related keystroke logging indicators, the results show that the method proposed in this study is feasible and effective for English technical writing teaching: it can help students analyze and understand terminology, grasp product information content and technical principles, and significantly improve the conciseness of technical documentation. Meanwhile, keystroke logging has a positive impact on the teaching of technical writing, and this study broadens its further research and application in the field of writing cognition.
Classification number:	H087
Total pages:	114
Total references:	66
Call number:	017/M2018(356)
Release date:	2018-05-25

服务于中小学教师的在线研修系统的设计与实现.李贺

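Title:	服务于中小学教师的在线研修系统的设计与实现
Author:	李贺
Student ID:	1501210581
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Availability:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	俞敬松
Advisor 1 affiliation:	School of Software and Microelectronics
Defense date:	2018-05-25
English title:	The Design and Implementation of Online Research and Training System for Teachers in Primary and Secondary Schools
Keywords:	teacher training and research; online training; collective lesson preparation
English keywords:	Teacher's Training and Research; Online Training; Collective lesson preparation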
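Abstract:	Teacher training and research is a form of professional development that integrates teaching research, training, and self-study, and it is an important route to raising the professional level of primary and secondary school teachers. Under traditional conditions, centralized training and research activities are heavily constrained by time and space, while existing platforms have many shortcomings in functionality, resources, and even underlying philosophy, and struggle to be useful in practice. Effectively improving the quality of teachers' online training and research requires a more scientific and effective online system. Against this background, and with the goal of "building a blended environment that integrates research and training", this thesis proposes a design for an online training and research system serving front-line primary and secondary school teachers. The design aims to break down the separation between online and offline activity so that online training and research genuinely serves to improve teaching. The thesis first reviews the current state of teacher training and of teaching research and lesson preparation, analyzes teachers' needs and scenarios for online training and research through questionnaires and interviews, and compares existing products of the same kind; on this basis it proposes the system's use cases, functional architecture, and evaluation metrics. Unlike earlier project-centered ways of organizing online training and research, the thesis proposes a school-based design in which online and offline work complement each other. According to the usage scenario, the system is divided into two subsystems, online training and online teaching research and lesson preparation, which are joined through single sign-on. For the training subsystem, the thesis proposes solutions and designs for problems such as coarse course management, poorly targeted training content, unreasonable evaluation methods, and unclear learning progress. For the teaching research and lesson preparation subsystem, it introduces the "full teaching case" design concept for recording the entire lesson preparation process and improves the sharing and collaboration functions, giving teachers a collaborative environment for online collective teaching research and lesson preparation. To improve development efficiency, the system is built by secondary development on top of open-source learning management systems according to the design goals. After a comprehensive comparison of several learning management systems, the best of several candidate designs was chosen for implementation. The training subsystem focuses on online course selection, lesson viewing, and evaluation; the teaching research and lesson preparation subsystem focuses on creating and sharing lesson plans online and on collective teaching research. Finally, the system was tested by writing test cases and inviting teachers to try it, and the soundness of the proposed design was verified by checking how well user requirements were met online and through user interviews.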
English abstract:	By integrating research, training, and self-education, teachers' training and research activity is an important way to meet professional development goals. Under traditional conditions, however, it is restricted by time and space and cannot satisfy the needs of teachers. To solve this problem, online training platforms appeared and played a significant role, but these platforms currently have many deficiencies in functions, resources, and ideas. In this context, aiming to provide primary and secondary school teachers with a collaborative online teaching and research environment, this paper puts forward the design of an online training and research system. This paper consists of seven chapters. It first reviews the status and related theories of teacher training and research. After that, it analyzes the needs of online training and lesson preparation from the teacher's point of view. Teachers' research behavior and existing platforms are then analyzed to help design proper functions. The system is designed as a school-based system and is composed of two main subsystems: an online training system and an online education research system. This paper introduces the design of the major functions of these two subsystems and illustrates the innovation of this design. The online training system is designed to solve problems such as improper curricula and evaluation methods and the inability to visualize learning progress. In the other system, this paper innovatively puts forward the design of the "full teaching case" and optimizes the sharing and collaboration functions. At the implementation stage, this paper compares multiple open-source learning management systems and chooses the most suitable ones. Afterwards, this paper carries out secondary development according to the requirements. Lastly, this paper conducts a series of tests to verify the effectiveness of the design and improves the functions according to the teachers' feedback. The summary and limitations of the research are given in the last chapter. Hopefully this study can bring convenience to primary and secondary school teachers.
Classification number:	G43
Total pages:	81
Total references:	64
Call number:	017/M2018(375)
Release date:	2018-05-25

翻转课堂教学的游戏化设计和实证研究.吴丹

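Title:	翻转课堂教学的游戏化设计和实证研究
Author:	吴丹
Student ID:	1501210727
Language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public release:	after 3 years
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor:	俞敬松
Advisor's affiliation:	School of Software and Microelectronics
Defense date:	2018-05-25
Title (foreign language):	Research on the Design and Application of Gamified Flipped Classroom
Keywords (Chinese):	翻转课堂; 游戏化; 教育设计研究
Keywords (foreign language):	Flipped classroom; Gamification; Educational Design Research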
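Abstract:	The rapid development of information technology has opened new directions for educational research, and the flipped classroom, as a new teaching model, is attracting growing attention. The flipped classroom overturns the traditional pattern in which the teacher actively lectures and students passively listen: students learn the basic course content from instructional videos before class, and in class they internalize that knowledge through group collaboration, teacher-student interaction, and similar activities. Many scholars regard this model as an effective way to promote self-directed learning and to improve students' overall abilities. In actual teaching, however, the model also has problems. As the teaching assistant for the course "Principles and Practice of Computer Aided Translation", the author used course participation, questionnaire surveys, and interviews to summarize the problems in the course as follows: students show little enthusiasm for learning from the instructional videos; and in collaborative work on research-oriented questions, individual contributions are unclear and the collaborative learning outcome is poor. How to raise students' enthusiasm for flipped-classroom learning, improve their learning outcomes, and increase their satisfaction with the flipped classroom has therefore become an urgent problem for teaching under this model. To address these problems, the author combines the course's teaching requirements with learners' needs, explains why introducing gamified learning into the existing teaching model is necessary, sets out the research approach and methods, and proposes a novel gamified learning method based on "student-generated questions". The method consists of four steps: students write questions based on the teaching content, the teacher gives feedback, the teacher selects questions, and students answer them. To verify the method's effectiveness, the author ran a teaching experiment lasting one and a half months, from mid-November 2017 to early January 2018, with 36 first-year graduate students of the 2017 cohort in the Department of Language and Information Engineering at Peking University. Using the experimental method and the educational design research method, the author designed and carried out teaching experiments for the two parts of the flipped-classroom model, "instructional video learning" and "collaborative learning on research-oriented questions", and then summarized the results through quantitative and qualitative analysis. The results show that students' enthusiasm for both instructional video learning and collaborative learning on research-oriented questions improved; that the learning outcomes of the flipped classroom improved, as seen in higher knowledge levels, thinking ability, and grades in instructional video learning, clearer individual contributions within groups under collaborative learning, and greater breadth and depth of knowledge; and that students welcomed the gamified flipped-classroom teaching model.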
Abstract (foreign language):	The technology-empowered flipped classroom has attracted increasing attention and is considered an effective substitute for its traditional counterpart. However, the teacher in a flipped classroom still has to deal with problems such as students' low motivation, unequal contributions from group members, and less-than-ideal collaboration. To enhance students' learning motivation, learning outcomes, and satisfaction, this paper proposes a gamified approach for the flipped classroom, based on the author's experience as the teaching assistant for the course "Principles and Practice of Computer Aided Translation". The approach features questions proposed by the students themselves. Specifically, it includes four steps: 1) the students come up with their discussion questions; 2) the teacher gives feedback on and evaluates the questions; 3) the teacher selects questions; 4) the students prepare and answer the selected questions. To verify the effectiveness of the method, a teaching experiment was conducted with 36 graduate students of the 2017 cohort in the Department of Language and Information Engineering of Peking University, from mid-November 2017 to early January 2018. The results were obtained through quantitative and qualitative analysis of the teaching experiment. They show that students' enthusiasm for instructional video learning and collaborative learning improved. The learning effect of the flipped classroom also improved, as seen in the students' knowledge level, thinking ability, and learning performance in instructional video learning; in collaborative learning, student contributions became more balanced, and the breadth and depth of knowledge expanded. The gamified flipped classroom teaching model is popular among students. In conclusion, the gamified approach proposed in this paper is effective.
Classification number:	C33
Total pages:	132
Number of references:	109
Call number:	017/M2018(430)
Release date:	2021-05-25

基于语块和数据分析的高中英语写作一体化的教学研究.迟蕊沂

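Title:	基于语块和数据分析的高中英语写作一体化的教学研究
Author:	迟蕊沂
Student ID:	1601210480
Language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public release:	after 3 years
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor:	俞敬松
Advisor's affiliation:	School of Software and Microelectronics
Defense date:	2018-05-25
Title (foreign language):	Research on High School English Writing Integration Teaching Based on Chunk and Data Analysis
Keywords (Chinese):	写作一体化; 语块; 个性化反馈; 数据分析; 高中写作
Keywords (foreign language):	Writing Integration; Chunk; Personalized feedback; Data analysis; High school English writing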
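Abstract:	Writing is an indispensable part of high school English teaching. Writing instruction in China has achieved notable results over the past few decades, but there is still considerable room for improvement. As scholars in China and abroad have studied writing instruction, "reading to promote writing" (以读促写), a new integrated approach to teaching writing, has been put into practice in university, higher vocational, and even some high school writing classes, where teachers and students have welcomed it. The model still has shortcomings, however: students depend heavily on the reading material, and their individual abilities cannot be raised within a short time. Moreover, comparing writing and reading ability, students progress noticeably more in reading than in writing. The read-to-write model therefore still needs further exploration and optimization. Taking high school English writing instruction as its starting point, and drawing on research on chunk-based teaching and second language acquisition together with new mobile-based teaching methods, this study proposes an integrated approach to teaching high school English writing based on chunks and data analysis. The approach refines reading instruction down to the level of chunks, emphasizes building students' basic skills, and attends to individual differences and the student's central role, producing personalized feedback for every student. The integrated writing instruction proposed here has five parts: chunk learning, timed writing, immediate feedback, model essay recitation, and personalized feedback. Chunk learning provides relevant chunk learning materials before students write. Timed writing is done on mobile devices so that students' writing time and process can be monitored and the pace of the integrated process controlled. Immediate feedback reviews students' essays promptly by combining machine scoring with human scoring. Model essay recitation adds a further language-input step after students have completed "input-output", turning the writing instruction flow into "input-output-reinput". Personalized feedback compiles each student's essay errors into an individual report that is given to the student, so that students become aware of their errors and try to avoid them. To verify the effectiveness of the integrated approach, this study ran a teaching experiment lasting nearly three months at Benxi High School in Liaoning Province, with 50 third-year high school students as subjects. The experimental group used the chunk- and data-analysis-based writing instruction method proposed here, and student data recorded during the experiment were analyzed and evaluated; the control group used traditional writing instruction, that is, timed writing and commentary within the writing class. The data show that the proposed integrated approach does raise students' writing scores and improve their writing ability. Encouragingly, the teaching materials provided by this study have been used and continually updated by the third-year teaching and research group at Benxi High School, and parts of the integrated teaching flow have been incorporated into current writing classes. In addition, the writing module of the 英语方舟考研 app has adopted some of the integrated writing teaching methods. The integrated approach has been genuinely welcomed by teachers and students.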
Abstract (foreign language):	English writing is an indispensable part of high school English teaching. Although the teaching of English writing in China has achieved remarkable results in the past decades, there is still room for improvement. With many scholars at home and abroad devoting themselves to the study of English writing, a new integrated mode of teaching English writing, the read-to-write mode, has been practiced and welcomed in colleges, higher vocational colleges, and even some high school writing classes. However, according to current teaching evaluations, the read-to-write mode still has many defects: students depend so heavily on the reading content that they cannot improve their individual abilities in a short period of time. Besides, when reading and writing abilities are compared, students' improvement is mainly in reading rather than writing. Consequently, the read-to-write mode in English writing classes has not yet realized its full advantage. Focusing on English writing teaching and learning practices in senior high school, this paper adopts an English writing integration method based on chunks and data analysis in the teaching process, using computer-aided tools. Theoretical and experimental results on chunks, English writing integration, and second language acquisition also provide guidance for this paper. The writing integration approach proposed here consists of five parts: chunk learning, time-limited writing, immediate feedback, model essay recitation, and personalized feedback. Chunk learning provides related chunk materials before formal writing. Time-limited writing means that the writing time is restricted to a fixed period to keep the writing integration on pace. Immediate feedback gives every student a timely review of their composition by combining machine scoring and human scoring. After feedback, the teacher promptly delivers model essays for students to recite, which is the fourth part. As for personalized feedback, the teacher gives every student a personalized report to remind them to avoid the mistakes in their compositions. To validate the effectiveness of the writing integration, a ten-week teaching experiment, spanning nearly three months, was conducted in the senior high school of Benxi City, Liaoning Province, with fifty students as subjects. The experimental group, consisting of twenty-five students, adopted the teaching method proposed in this paper. This paper also designs an online writing integration platform in order to compare the traditional writing class with the computer-aided class. The control group received the same writing classes and content at the same time but followed the traditional method of teaching English writing; their feedback focused mainly on their compositions and scores. The final results indicate that the writing integration brings better results for students' writing grades and writing abilities. Fortunately, the teaching materials collected in this paper are used in the writing classes of Benxi high school, and some methods from this paper have also been integrated into current English writing classes. The English writing integration is welcomed by both teachers and students.
Classification number:	H08
Total pages:	119
Number of references:	115
参考文献：	︿ [1] 中华人民共和国教育部.普通高中英语课程标准(2017 年版).北京:人民教育出版社，2018. [2] Hanklin, N. Relating reading and writing: Developing a transactional theory of the writing process: Monograph in language and reading studies[M]. Bloomington, IN: Indiana University School of Education.1982. [3] Shanahan, T. Nature of reading-writing relationships: An exploratory multi-variate analysis[J]. Journal of Educational Psychology.1984: 76,466-477. [4] 杨谢友.依托课文阅读素材高学生写作能力[J].中学外语教学中学篇,2010:33-37. [5] Johnson, K. R. The Second Language Curriculum[M]. Cambridge: Cambridge University Press. 1989:1-5. [6] Widdowson. H. G. Teaching Language as Communication[M]. Oxford. 1978. [7] Stotsky, S. Research on Reading and Writing Relationships: Language Arts[J]. Georgia State University, 1983. [8] Merrill Swain, Sharon Lapkin. Problems in Input and Output and the Cognitive Processes They Generate: A Step Towards Second Language Learning. Applied Linguistics[J].1995:371-391. https://academic.oup.com/applij/article-abstract/16/3/371/184113?redirectedFrom=fulltext. [9] Wai-king. Tsang, Comparing the Effects of Reading and Writing on Writing Performance[J]. Applied Linguistics, 1990:19-20. [10] James, F. Reading and Writing Relationship: Assumptions and Directions[M]. Newbury House, 1985:25-40. [11] William E. Messenger & Peter A. Taylor. Essentials of Writing[M]. Ontario: Printice Hall Canada Inc, 1989:47-89. [12] 李振起,李凯源.大学英语写作[M].天津:南开大学出版社,1994:1-2. [13] 路德庆.普通写作学教程[M].北京:高等教育出版社,1994:34-45. [14] 罗明礼.国外外语写作教学法之回顾[J],国外理论动态,2008:96-99. [15] Neomy Storch. Collaborative writing: Product Wring Approach and Students’ Reflections[M]. Journal of Second Language Writing. 2005, 14:153-173. [16] Grabe,W, R.B.Kaplan. Theory and Practice of Writing: An Applied Linguistic Perspective[M]. London: Longman, 1996:44-45. [17] 张吉生.周平.英语写作教学中“结果法”与“过程法”的对比研究[J].北京:外语与外语教学, 2002, 9:22-45 [18] William E. Messenger, Peter A. Taylor. Essentials of Writing[M]. Ontario: Printice Hall Canada Inc, 1989:345-346. 
[19] 罗明礼.国外外语写作教学法之回顾[J],国外理论动态, 2008: 118. [20] Bereter. C, M. Scardamalia. The Psychology of Writing Composition[M]. Hillasdale, NJ: L. Lawrence Erlbaum, 1987:44-67. [21] Badger. R, White. G. A Process Genre Approach to teach Writing[J]. ELT Journal, 54, 2: 153-160. [22] 布鲁克斯,格伦迪.英语写作教学[M].人民教育出版社. 2000: 16-44. [23] Claudia Keh. Feedback in the Writing Process: A Model and Methods for Implementation, ELT Journal, Volume 44(4), Oxford University Press,1990:45-55. 78 参考文献 [24] Hans P Guth. The Writer as Agenda: The Wadsworth Writer as Guide and Handbook[M]. California: Wadsworth Inc ,1989:197-199. [25] Claudia Keh, A Design for a Process Approach Writing Course[M]. Teaching Forum, 1990:45-55. [26] Seow, A. The Writing Process and the Process Writing in Methodology in Language Teaching: An Anthology of current Practice[M]. CUP, 2002, 14:11-19. [27] Flower. I, Hayes, J. Cognitive Process of Theory of Writing[J]. College Communication, 1981, 34, 2:263-387. [28] Flowerdew, L. Using a genre-based framework to each organizational academic writing[J]. ELT Journal 54, 4:369-378. [29] Lee, Icy. Genre-Based Teaching and Assessment in Secondary English Classroom[J].Writing Classroom, 2006, 6, 5: 20-36. [30] Krenshen, S. Second Language Acquisition and Second Language Learning [M].Oxford: Pergamon,1981: 47-48. [31] Krenshen, S. The Input Hypothesis: Issues and Implications [M]. London: Longman, 1985:2. [32] Krenshen, S. Principles and Practice in Second Language Acquisition [M]. Oxford: Pergamon, 1982:21-22. [33] Krenshen, S. Second Language Acquisition and Second Language Learning [M].Oxford: Pergamon,1981: 58-59. [34] Krenshen, S. The Hypothesis: Issues and Implications[M]. London: Longman, 1983:22-23. [35] Krenshen, S, Terrell, T. The Natural Approach[M]. Oxford: Pergamon Press, 1983:45. [36] Ellis, R. Understanding Second Language Acquisition[M]. Oxford: Oxford University Press, 1985:26-38. [37] Krenshen, S, Scarcella, R. 
On routines and patterns in language acquisition and performance[J]. Language Learning. 1978, 2:283-300. [38] Swain, M. The Output Hypothesis: Just Speaking and Writing Aren’t Enough[J]. The Canadian Modern Language Review, 1993:158-164. [39] Swain, M, Lapkin, S. Problem in Output and the Cognitive Processed They Generate: A Step Towards Second Language Learning[J]. Applied Linguistics, 1995:34-67. [40] Swain, M. Three Functions of Output in Second Language Learning[J]. Oxford, England: Oxford University Press, 1995:125-144. [41] Harmer, Jeremy. The Practice of English Language Teaching[M]. New York: Longman Inc. 1993. [42] Johnson, K. R. The Second Language Curriculum[M]. Cambridge: Cambridge University Press. 1989: 1-27. [43] Long, M. H. Input Interaction and Second Language Acquisition Theory[R]. Paper presented at the University of Michigan Conference on Applied Linguistics. Michigan. 1983:113. [44] Gass, S. Input, Interaction and the Second Language Learner[J]. New Jersey: Lawerence Erlbaum Associates, 1997:56-66. [45] 赵平.Swain的输出假设对大学英语写作的指导意义[J].山东外语教学.2003,03. [46] 冯纪元,黄姣.语言输出活动对语言形式习得的影响[J].现代外语.2004,02:195-200. [47] 刘君栓,刘晓华夏晓翠.第二语言习得中的语言输入与语言输出[J].社会科学报.2005,03. [48] 李红.可理解输出的认知基础[J].外语与外语教学.2002,02:10-12. 79 北京大学硕士研究生论文 [49] Jean Piaget. The Principle of Constructive Theory[M]. 1981:117-119. [50] Bruner. The Effect of language Acquisition on Cognitive Structuring: a cognitive motivational model[J]. Pers Soc Psychol Rev, 2013:119-136. [51] Tomasello, Michael. The New Psychology of Language Vol.2: Cognitive and Functional Approaches to Language Structure[M]. Mahwah, N. J., London Lawrence Erlbaum Associate, Inc, 2003:12-19. [52] Yasmin B. Kafei, Mitchel Resnick. Constructionism in Practice: Designing, Thinking and Learning in Language Learning[M], Lawrence Erlbaum Associate. Inc, 1996:77-98. [53] 皮亚杰,卢睿.皮亚杰教育论著选.人民教育出版社,1990. [54] Lavatelli, C. S. Piaget’s Theory Applied to an Early Childhood Curricum[J]. Delta Education, Inc. 1970: 4-5. [55] Becker, J. 
The Phrasal Lexicon[M]. Cambride Mass: Bolt, Beranek and Newman, 1975. [56] Nattinger J. R., DeCarrico J.S. Lexical Phrases and Language Teaching[M]. London: Oxford Unibersity Press, 1992:1-12. [57] Wray, A. Formulaic Language and the Lexicon[M]. Cambridge: Cambridge University Press, 2002: 34-56. [58] Miller G. A. The Magical Number Seven Plus or Minus Two: Some Limits on Our Capacity for Processing Information[J]. Psychological Review, 1956, 63(2): 81-97. [59] Nattinger J. R., DeCarrico J.S. Lexical Phrases and Language Teaching[M]. London: Oxford Unibersity Press, 1992:45-67. [60] Palmer, Gray. J, Toward a Theory of Cultural Linguistics[M]. Austin: University of Texas, 1996. [61] Lewis, M. The Lexical Approach: The State of ELT and the Way Forward[M]. London: Language Teaching Publications, 1997: 227-245. [62] Lewis M. Implementing the Lexical Approach: Putting Theory into Practice[M]. London: Language Teaching Publications, 1993:34-35. [63] Moon, R. Vocabulary Connections: Multi-Word Items in English[A]. // N. Schmitt, M. McCathy. Vocabulary: Description, Acquisition and Pedagogy[C]. Shanghai: Foreign Language Education Press, 2002. [64] Wray, A. Formulaic Language and the Lexicon[M]. Cambridge: Cambridge University Press, 2002: 27-45. [65] Wray, A. Formulaic Language in Learners and Native Speakers[J]. Language Teaching, 1999, 32: 213-231. [66] 刁琳琳.英语本科生词块能力调查[J].外语学刊, 2008, 6: 50-63. [67] Hakuta, K. Prefabricated Pattern and the Emergence of Structure in Second Language Acquisition[J]. Language Learning, 1974:287-297. [68] Krashen, S. Second Language Acquisition and Second Language Learning[M]. Oxford: Pergam on Press, 1981:24. [69] Krashen, S. The Input Hypothesis: Issues and Implications[M]. London: Longman. 1985:11-46. [70] Pawley, A., Syder, F. Two Puzzles for Linguistic Theory: Nativelike Selection and Nativelike Fluency[A]. in Richard J., Schmidt R. Language and Communication[C]. London: Longman, 1983: 192-226. [71] Lewis, M. 
The Lexical Approach: The State of ELT and A Way Forward. Hove: Language Teaching Publications, 1993: 44-100. [72] Nattinger, J. R., Decarrico, J. S. Oxford Applied Linguistics Lexical Phrases and Language Teaching[J]. Politics, 1992:34-45. [73] 周艳.外语教学中的语块教学[J].基础英语教育,2007,9(1):8-11. [74] 蒋苏琴,郭洁.二语学习者语块运用与二语阅读水平的相关性研究[J].外语学刊,2016(3). [75] Cruse, D. A. Lexical Semantics[M]. Cambridge: The University Press, 1986:1-12. [76] Becker Joseph D. The Phrasal Lexicon[J]. in Proceedings of the 1975 workshop on Theoretical issues in natural language processing. Theoretical Issue in Natural Language Processing[C], Cambridge, Massachusetts, 1975: 60-63. [77] Bauer, A. English Word-Formation[J]. Cambridge: Cambridge University Press, 1983:41-43. [78] Morton Benson, Evelyn Benson, Rebert, F. The BBI Combinatory Dictionary of English: A Guide to Word Combinations[M]. John Benjamins Publishing Company, 1986:12-145. [79] Noam Chomsky. Knowledge of Language: Its Nature, Origin, and Use[M]. Foreign Language Teaching and Research Press, 2002: 176-198. [80] 薛旭辉.认知语言学视域下的英语语块分类认知研究综述[J].西安:西安外国语大学学报,2012. [81] Nattinger J. R., DeCarrico J.S. Lexical Phrases and Language Teaching[M]. London: Oxford University Press, 1992:99-124. [82] L. von Bertalanffy, Foundations, Development, Applications[J], 1968:77-90. [83] Ausubel. D. P, Educational Psychology, A Cognitive View[M]. 1978:45-48. [84] James, F., Diane, Lapp. Reading and Writing Relationship: Assumptions and Directions[A]. In James R. Squire (ed). The Dynamics of Language Learning[C]. Urbana, Ill.: ERIC Clearinghouse on Reading and Communication Skills, 1987: 1-2. [85] 蒋显菊.构建英语写作教学新模式-LSRW 教学法理论与实践[J].外语教学理论与实践.2006: 1-19. [86] George W. Wilkin. Linguistics in Language Teaching. London: Longman, 1976: 56-78. [87] Flood James, Lapp. Reading and Writing Relationship: Assumptions and Directions[J]. The Dynamics of Language Learning, 1987:20. [88] Widdowson H, G. Teaching Language as Communication[M].
London: Oxford University Press, 1978: 45-67. [89] Wai-King Tsang. Comparing the effects of Reading and Writing on Writing Performance[J]. Applied Linguistics, 1990:45-46. [90] Stein, V. Elaboration: Using What You Know[J]. New York: Jessy House, 1990: 16. [91] Stotsky. Research on Reading and Writing Relationships: Language Arts[J]. 1987:17-19. [92] 谢薇娜.谈阅读与写作的交融性[J].外语教学, 1994, 4: 50-51(2). [93] 马广惠,文秋芳.大学生英语写作能力的影响因素研究[J].外语教学与研究,1999:35-40. [94] 中国社会科学院语言研究所.现代汉语词典(第六版).商务印书馆,2002. [95] A.S. Hornby.牛津高阶英汉双解词典.商务印书馆, 2016. [96] 丁言仁, 戚言.背诵课文在英语学习中的作用[J].外语界, 2005, 03:49-53. [97] Ellis, R. Understanding Second Language Acquisition[M]. Oxford: Oxford University Press, 1985: 40-45. [98] Scovel, J. English Teaching in China: A Historical Perspective[M]. Language Learning and Communication, 1983, 2: 105-109. [99] Skehan P.A. A Cognitive Approach to Language Learning[M]. UK: Oxford University Press, 1998: 15-16. [100] 袁磊.语篇背诵对高中生英语写作能力的实证研究[D].上海:华中师范大学, 2015. [101] Pennycook, A. Borrowing Other’s Words: Text, Ownership, Memory, and Plagiarism[J]. TESOL Quarterly, 1996, 30:201-230. [102] Parry, G. Repetition and Learning by Heart: An Aspect of intimate discourse, and its implications[J]. ELT Journal, 1994, 4: 133-141. [103] Krashen, S. Principles and Practice in Second Language Acquisition[M]. Great Britain: A. Wheaton, Co.Ltd. Exeter, 1982: 55-67. [104] 戴桢琼, 丁言仁.背诵课文在中国馆学生英语学习中的作用研究[J].外语研究, 2010, 2: 46-52. [105] Xiao Liu. A Study on the Effects of Reciting on Chinese College Students’ Oral English Proficiency[J]. Theory and Practice in Language Studies, 2011, 1: 1750-1755. [106] Hamer, J. The Practice of English Language Teaching[M]. New York: Longman Group Limited, 1983: 12-13. [107] Judy, S.N., S.J. Judy. An Introduction to the teaching of writing[J]. Illinois Scott Foresman Company, 1981:10-12. [108] 教育部考试中心.2018 年普通高等学校招生全国统一考试大纲的说明(文科)[M]. 高等教育出版社. 2018. [109] 曲一线.高考英语核心词汇[M].首都师范大学出版社, 教育科学出版社, 2017.
[110] C.Levy,S.Ransdell, The Science of Writing: Theories, Methods, Individual Differences, and Applications[M]. AAAI Press, 1996:34-55. [111] T. Silva, A Comparative Study of the Composing of Selected ESL and Native English Speaking Freshman Writers[C], Dissertation Abstracts International, 1990. [112] T. Fellner, M. Apple, Developing Writing Fluency and Lexical Complexity with Blogs[J]. Journal of Second Language Writing, 2006:15-26. [113] Jieun Chae, Ani Nenkova. Predicting the Fluency of Text with Shallow Structural Features: Case Studies of Machine Translation and Human-Written Text[J]. Pennsylvania: University of Pennsylvania Press, 2012:24-45. [114] N.CHENOWETH, J. Hayes, Fluency in Writing Generating Text in L1 and L2[J]. Written Communication, 2001:45-63. [115] L.Hamp-Lyons, W.Zhang, World English: Issues in and from Academic Writing Assessment, English for Academic Purposes. Research Perspective, 2001:19-91. ﹀
馆藏号：	017/M2018(552)
公开日期：	2021-05-25

面向读写一体化的英语写作系统的研究与设计.刘玥杉

题名：	面向读写一体化的英语写作系统的研究与设计
作者：	刘玥杉
学号：	1501210626
语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师姓名：	俞敬松
导师单位：	软件与微电子学院
答辩日期：	2018-05-25
题目(外文)：	Research and Design of Read-to-Write Integrated English Writing Application
关键字(中文)：	过程写作；读写一体化；激励；合作学习
关键字(外文)：	Process writing; Integration of reading and writing; Incentive; Cooperative learning
文摘：	︿英语写作能力是英语听、说、读、写四种基本能力之一，是语言综合能力的体现。进行写作训练有助于促进学习者语言知识的巩固与内化，使其语言能力得到全面性的提升。然而在教学实践中，存在输入与输出脱节、交互性不强、对学生的思维能力培养不足、写后修改不足等问题。此外，由于写作过程长、写作学习较为枯燥等原因，学生在进行写作训练的过程中，也存在语言基础知识积累不足、动力不足、缺乏兴趣、容易产生无助感、语言综合能力差等困难。随着互联网技术与移动应用的发展，移动产品因其便携性被逐渐应用于英语学习中。然而，市面上现有的写作系统多注重功能开发，并未结合相关教学理论与实践对整个写作过程进行优化，也不注重对学习者写作技巧与写作能力的培养。本系统基于对过程写作理论以及输入输出理论等相关理论的研究，结合实践中较为受认可的写作教学方式，经过对市面上竞品与部分功能近似产品的调查与分析后，针对传统教学中存在的问题及目标学生群体的写作需求进行设计。因系统功能较多，本文不能对每一设计细节都予以阐述，因而仅选择系统核心部分，即写作训练过程的设计、读写一体化的内容安排及学习方式进行详细说明，同时配合激励机制与轻社交化设计，以进一步提高用户学习兴趣，促进系统优化与完善。对于写作流程的设计，本文提出两种不同的设计方案，并从写作训练过程中选取三个具有代表性的页面分别提出不同的界面设计方案。基于以上设计，本文通过文献分析、实验观察、数据分析、访谈、调查问卷等方式分别论证了写作流程的不同方案与各界面设计的不同方案在降低用户认知负荷与促进学习目标达成等方面的有效性，并筛选出各不同方案中的最佳方案。总体来说，该系统克服了竞品的部分共性缺陷，优化了学生学习写作的流程，借助移动端的便捷性等优势，将输入学习与输出训练相结合，增强了师生间以及同伴之间的互动，并通过多样化的反馈以及激励设计促进学生进行修改，锻炼了学生的思维能力，丰富了学生的语言基础知识积累，全方位提高了学生的语言综合运用能力。此外，基于激励策略与合作学习原则设计的用户成长体系与社群学习机制等，有助于营造轻松的写作氛围，提升学生学习兴趣，增强写作的动力，缓解学生的无助感与焦虑感。该成果具有可扩展性，为移动环境下写作教学方法的设计与优化提供了建议，为后续关于英语写作系统的开发与应用提供了参考，对移动端教学与学习产品的设计具有一定的价值。﹀
文摘（外文）：	︿ English writing is one of the four basic language skills, alongside listening, speaking, and reading, and it reflects a learner's overall language ability. Writing practice helps learners consolidate and internalize linguistic knowledge, so that their language skills improve comprehensively. However, teaching practice suffers from problems such as a disconnect between input and output, weak interactivity, insufficient cultivation of students' thinking ability, and too little revision after writing. In addition, because the writing process is long and learning to write is tedious, students struggle with insufficient accumulation of basic linguistic knowledge, lack of motivation, lack of interest, a sense of helplessness, and poor overall language ability. With the development of Internet technology and mobile applications, mobile products are gradually being applied to English learning because of their portability. However, existing writing systems on the market focus mainly on feature development. They neither optimize the writing process on the basis of relevant teaching theory and practice, nor cultivate learners' writing skills and writing ability. The design of this system is based on process writing theory, the input and output hypotheses, and other related theories, combined with well-recognized writing teaching methods. Moreover, drawing on an investigation and analysis of competing products and products with similar functions, the design targets the problems in traditional teaching and the writing needs of the target student group. Because the system has many functions, this thesis cannot explain every design detail. Therefore, only the core parts of the system, namely the design of the writing practice process, the content arrangement, and the learning method for integrating reading and writing, are explained in detail.
At the same time, the incentive mechanism and light social design further raise users' interest in learning and promote the optimization and improvement of the system. For the writing process, this thesis proposes two different design schemes; it also selects three representative pages from the writing practice process and proposes alternative interface designs for each. Based on these designs, this thesis uses literature analysis, experimental observation, data analysis, interviews, and questionnaires to demonstrate the effectiveness of the different writing-process schemes and interface designs in reducing users' cognitive load and promoting the achievement of learning objectives, and then selects the best option among them. In general, the system overcomes some of the common defects of competing products, optimizes the process of learning to write, and uses the convenience of mobile devices to combine input learning with output training, thereby enhancing interaction between teachers and students and among peers. Through varied feedback and incentive design, the system encourages students to revise, trains their thinking ability, enriches their basic linguistic knowledge, and improves their overall ability to use the language. In addition, the user growth system and community learning mechanism, designed on the basis of incentive strategies and cooperative learning principles, help create a relaxed writing atmosphere, raise students' interest in learning, strengthen their motivation to write, and alleviate their sense of helplessness and anxiety. The result is extensible. It offers suggestions for designing and optimizing writing teaching methods in the mobile environment, provides a reference for the subsequent development and application of English writing systems, and has some value for the design of mobile teaching and learning products. ﹀
分类号：	G43
论文总页数：	83
参考文献数：	54
参考文献：	︿邓鹂鸣, 刘红, 陈艳,等. 2004. 过程写作法在大学英语写作实验教学中的运用[J]. 外语教学, 25(6):69-72. 邓鹂鸣, 刘红, 陈芃,等. 2003. 过程写作法的系统研究及其对大学英语写作教学改革的启示[J]. 外语教学, 24(6):58-62. 龚德英. 2009. 多媒体学习中认知负荷的优化控制[D]. 西南大学. 侯彩静, 苏鹏, 王叶芳. 2017. 同伴评价在大学英语过程写作教学中的应用研究[J]. 大学教育, (1):105-106. 黄渐法. 2017. 基于小组合作的英语写作教学探索[J]. 西部素质教育, 3(22):223-224. 姜炳生. 2003. 英语写作教学中的“过程写作教学法”再研究[J]. 西安外国语大学学报, 11(4):13-16. 赖慧云. 2015. 过程写作在高中英语写作分层教学中的尝试[J]. 基础外语教育, 17(6):76-80. 兰良平, 韩刚. 2014. 英语写作教学 : 课堂互动性交流视角[M]. 外语教学与研究出版社. 李金波, 许百华. 2009. 人机交互过程中认知负荷的综合测评方法[J]. 心理学报, 41(1):35-43. 林刚, 陈国江. 2007. 网络学习环境对认知负荷的影响及对策研究[J]. 中国远程教育:综合版, (8S):35-38. 麦春萍, 宋翠萍. 2017. 基于网络批改的大学英语写作过程教学法实践探索[J]. 科技视界, (8):62-62. 孟冬梅. 2013. 激励式外语教学设计[M]. 江西人民出版社. 倪清泉. 2009. 网络环境下基于协作学习的大学英语写作教学研究[J]. 外语电化教学, (3):63-68. 庞维国. 2011. 认知负荷理论及其教学涵义[J]. 当代教育科学, (12):23-28. 曲巍巍. 2016. 基于自动评分系统的协作式大学英语写作教学实证研究[J]. 亚太教育, (32):110-111. 秦晓晴，文秋芳. 2007. 中国大学生英语写作能力发展规律与特点研究[M]. 中国社会科学出版社. 任卓心. 2017. 多媒体学习中认知负荷测量方式分析[J].教育现代化, 4(18):130-131. 唐芳, 徐锦芬. 2011. 大学英语写作自我效能感调查与研究[J]. 外语界, (6):22-29. 吴荣辉, 何高大. 2014. 合作学习在大学英语写作教学中的应用效应研究[J]. 外语教学, 35(3):44-47. 吴育红. 2013. 同伴互评对自我效能感的影响——一项基于大学英语写作的实证研究[J]. 山东外语教学, (6):68-72. 吴育红, 顾卫星. 2011. 合作学习降低非英语专业大学生英语写作焦虑的实证研究[J]. 外语与外语教学, (6):51-55. 文秋芳. 2008. 输出驱动假设与英语专业技能课程改革[J]. 外语界, (2):2-9. 文秋芳，1996，英语学习策略论[M] ，上海外语教育出版社. 杨进中. 2012. 认知负荷理论视角的移动课程教学设计原则[J]. 现代远程教育研究, (3):86-90. 杨维东, 贾楠. 2011. 建构主义学习理论述评[J]. 理论导刊, (5):77-80. 张会萍. 2017. 网上同伴互评在英语写作能力发展中的积极作用[J]. 英语教师, 17(22):27-31. 钟云霞. 2017. 基于认知负荷理论的大学英语过程写作移动学习设计探析[J]. 德州学院学报, 33(1):29-33. 张志友. 2016. 同伴互评对学生写作自主性的激励[J]. 才智, (32). 周海银. 2015. 教学测量与评价[M]. 山东大学出版社. 朱立明. 2014. 认知负荷理论对大学英语写作教学的启示[J]. 科教导刊, (34):175-176. 张景伟. 2014. 基于输出驱动假设的大学英语写作教学模式[J]. 现代交际, (6):242-243. 赵翊君, 杨跃. 2013. 博客辅助英语专业过程写作教学模式的实证研究[J]. 外语电化教学, (5):46-51. 展鑫磊, 刘永兵. 2011. 中国学生英语合作学习因素与学习成绩的关系探究[J]. 中国外语教育, (1):12-21. 赵海红. 2010. “初中英语阅读写作一体化”教学模式初探[D]. 东北师范大学. Atkinson R K, Renkl A, Merrill M M. 2003. 
Transitioning from studying examples to solving problems: Effects of self-explanation prompts and fading worked-out steps[J]. Journal of Educational Psychology, 95(4): 774. Dörnyei Z. 2001. Motivation strategies in the language classroom[M]. Ernst Klett Sprachen. Deci E L, Ryan R M. 1975. Intrinsic motivation[M]. John Wiley & Sons, Inc. Gardner R C, Lambert W E. 1959. Motivational variables in second-language acquisition[J]. Canadian Journal of Psychology/Revue canadienne de psychologie, 13(4): 266. Hein G. 1991. Constructivist learning theory[J]. Institute for Inquiry. Available at: http://www.exploratorium.edu/ifi/resources/constructivistlearning.html. Johnson D W, Johnson R T. 1989. Cooperative learning: What special education teachers need to know[J]. The Pointer, 33(2): 5-11. Jacobs H L. 1981. Testing ESL Composition: A Practical Approach. English Composition Program[M]. Newbury House Publishers, Inc., Rowley, MA 01969. Keh C L. 1990. Feedback in the writing process: A model and methods for implementation[J]. Krashen S D. 1985. The input hypothesis: Issues and implications[M]. Addison-Wesley Longman Ltd. Krashen S D. 1984. Writing, research, theory, and applications[M]. Pergamon. Leahy W, Sweller J. 2008. The imagination effect increases with an increased intrinsic cognitive load[J]. Applied cognitive psychology, 22(2): 273-283. Oxford R L. 1990. Language learning strategies and beyond: A look at strategies in the context of styles[J]. Shifting the instructional focus to the learner, 35-55. Paas F G W C, Van Merriënboer J J G, Adam J J. 1994. Measurement of cognitive load in instructional research[J]. Perceptual and motor skills, 79(1): 419-430. Quilici J L, Mayer R E. 1996. Role of examples in how students learn to categorize statistics word problems[J]. Journal of Educational Psychology, 88(1): 144. Swain M. 1995. Three functions of output in second language learning[J]. Principle & practice in applied linguistics. Swain M. 1985.
Communicative competence: Some roles of comprehensible input and comprehensible output in its development[J]. Input in second language acquisition, 15: 165-179. Sweller J, Van Merrienboer J J G, Paas F G W C. 1998. Cognitive architecture and instructional design[J]. Educational psychology review, 10(3): 251-296. Sweller J. 1988. Cognitive load during problem solving: Effects on learning[J]. Cognitive science, 12(2): 257-285. Williams M., Burden R. 1997. Psychology for language teachers[J]. Cambridge: Cambridge. Zamel V. 1983. The composing processes of advanced ESL students: Six case studies[J]. TESOL quarterly, 17(2): 165-188. ﹀
馆藏号：	017/M2018(573)
公开日期：	2021-05-25

官话方言翻译黑人英语的策略研究——以《绝非虚构：我的人生教训》为例.徐靖凯

题名：	官话方言翻译黑人英语的策略研究——以《绝非虚构：我的人生教训》为例
作者：	徐靖凯
学号：	1501210746
语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师姓名：	李博婷
导师单位：	软件与微电子学院
答辩日期：	2018-05-25
题目(外文)：	STRATEGIES FOR TRANSLATING AFRICAN AMERICAN VERNACULAR ENGLISH INTO MANDARIN DIALECTS
关键字(中文)：	美国黑人英语；官话方言；方言翻译；翻译策略
关键字(外文)：	African American Vernacular English; Mandarin dialects; Dialect translation; Translation strategies
文摘：	︿美国黑人英语和标准美国英语在语音、词汇和语法方面存在明显差异，译者在翻译时需要找到恰当的方式以再现这种差异，否则原文的艺术效果将大打折扣。本文基于美国著名非裔单口喜剧演员凯文·哈特（Kevin Hart）的自传《绝非虚构：我的人生教训》（I Can't Make This Up: Life Lessons）一书的翻译实践，就如何在译文中使用现代汉语方言中的官话方言突出美国黑人英语的风格特色进行相关翻译原则和策略的研究。首先，笔者通过文献研究发现前人在美国黑人英语汉译实践中主要采用了三种翻译方法，即普通话法、语音飞白法和方言对译法。其次，笔者对45位受访者进行了访谈，了解他们对以上三种译法的看法、并结合名著译例具体了解他们对方言对译法的态度及建议。最后，综合文献研究和受众访谈，笔者提出本次翻译实践所遵循的翻译原则及所采用的翻译策略，并将代表性译例交予受访者评价。本文研究发现，大部分受访者对方言对译法持开放的态度，认为此译法虽然可理解度低于普通话法，但是能体现出原文中方言和标准语之间的差异；此译法虽然差异体现度低于语音飞白法，但是具有较高的可接受度。同时，受访者指出需要在体现方言风味的同时兼顾读者的可理解度。因此，笔者认为方言对译法具有可行性和接受度，并提出使用现代汉语七大方言中分布范围最广、使用人数最多以及通用性最强的官话方言对黑人英语进行翻译。综上所述，本文基于翻译实践提出了使用汉语官话方言翻译美国黑人英语时所遵循的三大翻译原则：（1）通用性，即使用大众熟知的官话方言翻译美国黑人英语，在不影响读者理解的前提下体现方言风味；（2）层次性，即根据原文中不同人物的语言特点，采用不同层次的官话方言进行翻译；（3）适度性，即不频繁密集地使用能体现方言风味的手段，以免给译文增添过度的方言风味，增加读者的阅读负担。同时，本文提出了以点带面、突出对比、方言谐音和行话翻译四种翻译策略，并将笔者译例交予受访者评价，得到了一定程度的认可。﹀
文摘（外文）：	︿ There are obvious differences between African American Vernacular English (AAVE) and General American (GA) in terms of pronunciation, vocabulary and grammar, and translators need to find appropriate ways to reproduce these differences, otherwise the artistic effects in the original text will be greatly reduced. This paper discusses the use of Mandarin dialects in reproducing the stylistic features of AAVE in Chinese translations based on the translation practice of Kevin Hart’s autobiography I Can't Make This Up: Life Lessons. First of all, through literature review, the author finds three translation methods adopted by predecessors, including the use of Putonghua, Chinese homonyms and Chinese dialects. Secondly, the author interviews 45 target readers to understand their views on the above-mentioned three methods and their overall attitudes towards the “Chinese dialects” method. Finally, combining literature review and audience interviews, the author proposes the translation principles and strategies for this translation practice. This paper thus concludes that most interviewees are open-minded about the “Chinese dialects” method and believe that although it is less understandable than the “Putonghua” method, it can reproduce differences; although it cannot compete with the “Chinese homonyms” method in terms of reproducing differences, it has a higher degree of acceptance. Therefore, the author believes that the “Chinese dialects” method is both feasible and acceptable and proposes to translate AAVE into Mandarin dialects, which are one of the seven dialects of modern Chinese with the biggest population, widest coverage and best universality. To sum up, this paper puts forward three principles for translating AAVE into Mandarin dialects, including universality, hierarchy and appropriateness. 
Meanwhile, four strategies are proposed, including the selective “point-for-area” strategy, “outstanding contrast” strategy, “Chinese dialect homonyms” strategy and “jargon translation” strategy. ﹀
分类号：	H087
论文总页数：	203
参考文献数：	66
参考文献：	︿陈胜利. 不似之似——论翻译中的方言借用[J]. 前沿, 2013(16): 127-129. 陈熙涵. “梗”字流行是个什么梗[EB/OL]. (2016-09-09) [2018-03-10]. http://whb.cn/zhuzhan/xinwen/20160909/68802.html 戴炜栋. 新编简明英语语言学教程[M]. 上海: 上海外语教育出版社, 2002: 114-119. 郭熙. 中国社会语言学[M]. 南京: 南京大学出版社, 1999: 148. 哈代. 张谷若译. 德伯家的苔丝[M]. 上海: 商务印书馆, 1936: 54. 哈代. 张谷若译. 德伯家的苔丝[M]. 北京: 人民文学出版社, 1984: 71. 韩子满. 试论方言对译的局限性——以张谷若先生译《德伯家的苔丝》为例[J]. 解放军外国语学院学报, 2002, 25(4): 86-90. 韩子满. 英语方言汉译初探[M]. 郑州: 河南大学出版社, 2004: 77-103. 黄伯荣, 廖序东. 现代汉语（增订三版）上册[M]. 北京: 高等教育出版社, 2002: 4. 黄景湖. 汉语方言学[M]. 厦门: 厦门大学出版社, 1987: 1. 黄忠廉. 方言翻译转换机制[J]. 北京理工大学学报 (社会科学版), 2012, 14(2): 144-147. 姜静. 国外方言翻译研究三十年: 现状与趋势[J]. 解放军外国语学院学报, 2016, 39(2): 123-131. 李劼人. 暴风雨前[M]. 北京: 人民文学出版社, 1982: 5. 李荣. 上海方言词典[M]. 南京: 江苏教育出版社, 1997: 80. 李荣. 官话方言的分区[J]. 方言, 1985(1): 2-5. 李如龙. 汉语方言学[M]. 北京: 高等教育出版社, 2001: 1. 林森. 论 “栋笃笑”的对话与狂欢——以《跟住去边度？》为例[J]. 中山大学研究生学刊: 社会科学版, 2011(2): 149-154. 刘叔新. 汉语描写词汇学[M]. 北京: 商务印书馆, 1990: 245-246. 马克·吐温. 张万里译. 哈克贝里·芬历险记[M]. 上海:上海译文出版社1984: 57. 马克·吐温. 成时译. 哈克贝利·费恩历险记[M]. 北京: 人民文学出版社, 1989: 89. 马克·吐温. 许汝祉译. 赫克尔贝里·芬历险记[M]. 南京: 译林出版社, 1998. 马格丽泰·密西尔. 傅东华译. 飘[M]. 杭州: 浙江人民出版社, 1979: 23. 玛格丽特·米切尔. 李美华译. 飘[M]. 南京: 译林出版社, 2008: 42. 戚雨春. 语言学百科词典[M]. 上海: 上海辞书出版社, 1993: 193. 钱曾怡. 汉语官话方言研究[M]. 济南: 齐鲁书社, 2010: 1. 曲彦斌. 俚语隐语行话词典[M]. 上海: 上海辞书出版社, 1996: 前言3. 斯陀夫人. 黄继忠译. 汤姆大伯的小屋[M]. 上海: 上海译文出版社, 1982. 斯陀夫人. 王家湘译. 汤姆叔叔的小屋[M]. 北京: 人民文学出版社, 1998. 宋启瑜. 脱口秀初探[J]. 曲艺, 2014(7): 15-16. 孙迎春. 张谷若翻译艺术研究[M]. 北京: 中国对外翻译出版公司, 2004: 21-22. 孙致礼. 再谈文学翻译的策略问题[J]. 中国翻译, 2003(1): 50-53. 唐枢. 蜀籁[M]. 成都: 四川人民出版社 1982: 194. 汪宝荣, 谢海丰. 西方的文学方言翻译策略研究述评[J]. 外国语文研究, 2016(4): 39-49. 汪平. 试论书面语与口语、方言、普通话的关系[J]. 中国方言学报, 2013(1): 206. 王孟包. 浅说伦敦方言[J]. 教学研究, 1986(01): 47-52. 王艳红. 美国黑人英语汉译研究——伦理与换喻视角[D]. 南开大学, 2010. 新华通讯社译名室. 英语姓名译名手册（第四版）[M]. 北京: 商务印书馆, 2004. 许宝华, 宫田一郎. 汉语方言大词典[M]. 北京: 中华书局, 1999. 徐朝晖. 当代流行语研究 [M]. 广州: 暨南大学出版社, 2013. 荀恩东, 饶高琦, 肖晓悦, 等. 大数据背景下 BCC 语料库的研制[J]. 语料库语言学, 2016(1): 95. 杨建国. 流行语的语言学研究及科学认定[J]. 语言教学与研究, 2004(6): 63-70. 游汝杰. 汉语方言学教程[M]. 上海: 上海教育出版社, 2004: 1. 袁家骅. 汉语方言概要（第二版）[M]. 
北京: 文字改革出版社, 1983: 85. 赵晓阳. 汉语官话方言圣经译本考述[J]. 世界宗教研究, 2013(06): 77-86. 张大春. 认得几个字[M]. 上海: 上海人民出版社, 2009: 365-366. 郑石, 张绍刚. “单口喜剧”类节目的概念辨析及文化思辨[J]. 文艺评论, 2017(8): 108-113. 中国大百科全书总编辑委员会. 中国大百科全书: 戏曲曲艺[M]. 北京: 中国大百科全书出版社, 2002: 508. 中国社会科学院语言研究所词典编辑室. 《现代汉语词典》五十年[M]. 北京: 商务印书馆, 2004: 77-82. 中国社会科学院语言研究所，中国社会科学院民族学与人类学研究所，香港城市大学语言资讯科学研究中心.中国语言地图集[M]. 第二版.北京: 商务印书馆, 2012. 中国社会科学院语言研究所词典编辑室. 现代汉语词典. 第7版[M]. 北京:商务印书馆, 2016. Catford J C. A Linguistic Theory of Translation[M]. London: Oxford University Press, 1965: 85. Crystal D. The Story of English in 100 Words[M]. London: Profile Books, 2011: 359. Cuddon J A. The Penguin Dictionary of Literary Terms & Literary Theory[M]. New Jersey: Blackwell Pub, 1991: 217. Dean G, Allen S. Step by Step to Stand-Up Comedy[M]. London: Heinemann, 2000: 187-189. Green, Lisa J. African American English: A Linguistic Introduction[M]. Cambridge: Cambridge University Press, 2002: 6. Hatim B, Mason I. Discourse and the Translator[M]. London: Routledge, 1990: 45. Labov W. Academic Ignorance and Black Intelligence[J]. Atlantic, 1972: 59-67. Lanehart, Sonja, ed. The Oxford Handbook of African American Language[M]. Oxford: Oxford University Press, 2015: 1. Lipski J M. Y’all in American English: From Black to White, From Phrase to Pronoun[J]. English World-Wide, 1993, 14(1): 23-56. Montgomery M. The Etymology of Y’all[J]. Old English and New: Studies in Language and Linguistics in Honor of Frederic G. Cassidy, 1992: 356. Rickford, John R. African American Vernacular English Features, Evolution, Educational Implications[M]. New Jersey: Wiley-Blackwell, 1999. Rickford, John R. Spoken Soul: The Story of Black English[M]. New Jersey: John Wiley & Sons Inc, 2000. Rosa A A. Translating Place: Linguistic Variation in Translation[J]. Word & Text: A Journal of Literary Studies & Linguistics, 2012, 2(2). Sánchez, María T. Translation as A(n) (Im) possible Task: Dialect in Literature[J]. Babel 1999: 308. Skandera P, Burleigh P. 
A Manual of English Phonetics and Phonology: Twelve Lessons with An Integrated Course in Phonetic Transcription[M]. Tübingen: Gunter Narr Verlag, 2005: 60. Smith T W. Changing Racial Labels: From “Colored” to “Negro” to “Black” to “African American” [J]. Public Opinion Quarterly, 1992, 56(4): 496-514. ﹀
馆藏号：	017/M2018(249)
公开日期：	2021-05-25

葡萄酒文化的通俗化翻译——以《新索斯比葡萄酒大百科》为例.方一凡

题名：	葡萄酒文化的通俗化翻译——以《新索斯比葡萄酒大百科》为例
姓名：	方一凡
学号：	1501210519
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软件与微电子学院
论文答辩日期：	2018-05-25
外文题名：	Popularization of Wine Culture in Translation
关键词：	葡萄酒；通俗化；翻译；专业弱化
外文关键词：	Wine; Popularization; Translation; De-professionalization
论文摘要：	︿目前中国葡萄酒市场发展迅速，消费者已呈现出年轻化、大众化趋势，然而葡萄酒文化传播尚处于萌芽阶段。为了吸引更多的普通消费者接触葡萄酒、了解葡萄酒并转化为消费，业内也出现了“通俗化”、“去专业化”的呼声，而国学、气象、财经、法律、科普等其他专业领域也有人进行过知识通俗化的尝试。在此背景下，笔者认为葡萄酒著作作为传播葡萄酒文化知识的重要载体，探索其通俗化翻译，有利于让更广阔的普通大众接触到浅近易懂的葡萄酒文化知识。因此，本文目的则是要基于 Tom Stevenson 所著《新索斯比葡萄酒大百科》（The New Sotheby’s Wine Encyclopedia）一书的英汉翻译实践，探索葡萄酒文化通俗化翻译的原则及策略。本文首先对两本畅销葡萄酒通俗读物的读者评论及通俗化策略进行了分析。接着，确定待译文本的受众定位及风格特色。之后，笔者与 15 位葡萄酒业内专业读者以及 1 位葡萄酒高级讲师进行了结构式访谈，了解其对于通俗化翻译的看法及建议。再邀请 15 位非专业读者中的 6 位对原始直译版译文进行了通读，圈定待通俗化处理的部分共计 83 处，并就“读后感受”及“通俗化需求”进行了简单交流。最后，笔者向《进口葡萄酒相关术语翻译规范》行业标准起草方，以及亚洲首位女性葡萄酒大师李志延的翻译团队就葡萄酒行业翻译标准进行了了解，并立足待译文本实际情况，提出了本次通俗化翻译原则及策略，后将通俗化后的典型译例交由上述 31 位读者阅读评价。本文研究发现，通俗化是相对于读者知识层次而言的普遍需求。鉴于本次待译部分对于非专业读者帮助更大，翻译时会更加向其诉求倾斜。但其实际需求、接受能力及喜好偏向存在差异，因而需立足原作风格特色，做到相对通俗化。综合各方观点，笔者提出本次翻译原则：（1）立足原作严谨专业的风格特色，在尊重专业性的基础上进行通俗化，因而不会做面目全非式的改写，将原作译成“低幼绘本”或“故事书”，而是通过解释、补偿等手段降低专业知识的难度，并且避免低俗化；（2）通俗化翻译相对于原文直译而言，且并非“一目了然、人人都懂”，过程中会向非专业读者的层次倾斜，但不等于一味迁就，最终实现相对的浅近易懂。并提出“补足策略”、“贴近策略”、以及“可视化策略”三类翻译策略，且译例也在读者间得到了一定程度的认可。﹀
外文摘要：	︿ Though the Chinese wine market is developing rapidly, the dissemination of wine culture is still in its infancy. To attract consumers, there have been calls for “de-professionalization” in the wine industry. The author believes that the popularization of wine culture makes the wine knowledge more accessible to the wider public, and the purpose of this paper is therefore to explore the principles and strategies for the popularization of wine culture in the Chinese translation based on the author’s translation of Tom Stevenson’s The New Sotheby’s Wine Encyclopedia. First of all, the paper studies the popularization strategies of two wine best-sellers. Next, the author interviews 16 wine professionals to understand their views and suggestions on the topic. Then, 6 of the 15 non-professionals have been invited to read the original translations and, as a result, they identified the total number of 83 translations to be revised in a popularized way. Finally, after studying the industrial standards for wine translation from the Chinese managerial authority as well as from the influential Asian Master of Wine Jeannie Cho Lee’s translation team, the author proposes the principles and strategies for his own translation, with the revised translations evaluated by the above 31 interviewees. This paper thus concludes that popularization is a general demand shared by almost all readers with varying degrees of knowledge about wine. Considering that non-professionals are the main target readers of this translation project, the author is more inclined to appeal to them but adheres to the principles as follows: (1) to show respect for professionalism and the original style of the translated book, and avoid a thorough rewriting as well as vulgarization; (2) to avoid yielding to all the readers’ needs, since popularization is a relative concept and absolute popularization is almost impossible to achieve. 
Also, the author puts forward translation strategies, namely “complementability”, “accessibility”, and “visibility”, and the popularized translation has won a relatively high degree of acceptance among the 31 interviewees. ﹀
分类号：	H087
论文总页数：	191
参考文献总数：	77
参考文献列表：	︿毕璐. 从葡萄酒酒标的构成谈酒标的翻译[J]. 大江周刊:论坛, 2013(5):177-177. 曹丽蒲, 周红. 葡萄酒文化传播的对策研究[J]. 中国商论, 2014(36):165-168. 陈彦孜. 经典作品的去经典化翻译[D]. 湖南大学, 2016. 陈军. 纽马克翻译理论视角下葡萄酒术语翻译研究——以《品酒:罗宾逊品酒练习册》为例[D]. 对外经济贸易大学, 2016. 邓永标. 国学经典图书出版的策划与推广[J]. 出版广角, 2017(20). 冯俊. 创新喝法去专业化敞开葡萄酒消费大门[J]. 新食品, 2016(37):64-68. 郭建中. 重写:科普文体翻译的一个实验——以《时间简史》(普及版)为例[J]. 中国科技翻译, 2007, 20(2):1-6. 郭建中. 科普与科幻翻译[M]. 北京:中国对外翻译出版公司, 2004. 胡红辉. 对经典古籍通俗化版本出版发展的几点认识——以《论语》通俗化版本出版为例[J]. 出版广角, 2015(1):88-91. 贺群. 再谈翻译的通俗性[J]. 语言与翻译(汉文版), 2002(1):54-56. 黄蓉.凰家酒咖|如何让葡萄酒更有中国年味儿？走下神坛，通俗易饮[EB/OL].（2017-01-26）[2018-03-02]. http://jiu.ifeng.com/a/20170126/44536996_0.shtml. 金学勤. 通俗简练瑕不掩瑜——评戴维·亨顿的《论语》和《孟子》英译[J]. 孔子研究, 2010(5):117-123. 姜秋勇. 从巴斯奈特翻译理论看进口葡萄酒品牌的翻译[J]. 中国培训, 2016(20):279-280. 冷冰冰. 编译策略在科普杂志翻译中的应用[J]. 中国科技翻译, 2017, 30(4):5-8. 冷冰冰. 科普杂志翻译规范研究[D].上海外国语大学,2017. 林元彪. “碎片化阅读”时代中国典籍的翻译策略——以赖发洛《论语》英译本的“近译”策略为例[J]. 上海师范大学学报(哲学社会科学版), 2015(5):143-152. 刘悠翔.都说冯唐翻译的《飞鸟集》缺乏信达雅，他是不是玩砸了？[EB/OL].（2016-01-08）[2018-03-02]. http://static.nfapp.southcn.com/content/201601/08/c33583.html. 刘玲. 语境对译文的影响——葡萄酒行业翻译案例分析[J]. 新校园旬刊, 2011(1):180-181. 刘进才. 语言文学的现代建构[M]. 北京大学出版社, 2015. 李华. 葡萄与葡萄酒词典[M]. 西北农林科技大学出版社, 2013. 李志延. 东品西酿:首位亚裔葡萄酒大师教你品尝葡萄酒[M]. 戴鸿靖, 严轶韵等, 译. 北京: 中信出版社, 2012. 罗宾逊. 品酒:罗宾逊品酒练习册[M]. 吕杨, 吴岳宣, 译. 上海:上海三联书店, 2011. 罗宾逊, 约翰逊. 世界葡萄酒地图[M]. 林裕森, 陈匡民等, 译. 台北:积木文化出版社, 2010. 罗粲文. 关于葡萄酒英文指南翻译的实践报告[D]. 华北电力大学, 2017. 茅盾. 通俗化、大众化与中国化[J]. 新疆社会科学, 1983(2):98-100. 木艳娟, 于陈辰. 翻译目的论视角下的法译汉——以法国葡萄酒产业为例[J]. 西南民族大学学报: 人文社会科学版, 2012 (S1): 118-121. 马晶晶. 《波尔多传奇》翻译中文化障碍的化解[D]. 烟台大学, 2016. 马红. 《葡萄酒圣经》(节选)翻译实践报告——葡萄酒术语的翻译[D]. 宁夏大学, 2015. 帕克特, 海默克. 把这瓶开了!一看就懂的葡萄酒品鉴、配餐、选购指南[M]. 王琰, 译. 南京:江苏凤凰文艺出版社, 2016. 石田博. 你不懂葡萄酒[M]. 张暐, 译. 南京:江苏凤凰文艺出版社，2016. 申明.浅析科普新闻传播中的多元化特征[J].新闻知识，2013（4）：15-16. 宋福彬.《葡萄酒质量:品尝与挑选》的翻译实践报告[D]. 重庆大学, 2016. 唐文龙. 构建有"中国印记"的葡萄酒文化[J]. 酿酒科技, 2007(4):133-134. 唐星宇, 潘耀清, 关鸿志. 用修辞手法实现气象新闻报道语言的通俗化[J]. 广东气象, 2009, 31(5):39-40. 王建开. 经典与当代的融合:中国文学作品英译的通俗形态[J]. 当代外语研究, 2014(10):49-53. 王建开. 借用与类比:中国文学英译和对外传播的策略[J]. 外文研究, 2013(1). 王春景. 原本通俗,译本难懂[J].
中国图书评论, 2005(1):32-34. 王仕佐, 黄平. 论中国的葡萄酒文化[J]. 酿酒科技, 2009(11):136-143. 王江松.吴书仙：创造中国的葡萄酒文化[EB/OL].（2012-07-16）[2018-03-02]. http://www.bizwines.com/a/wenhua/renwu/2012/0716/4202.html. 王文丽. 论专业文本如何保持译本专业性的问题——以葡萄酒法规项目(欧盟1234/2007号条例)拟议修正案汉译为例[D]. 上海外国语大学, 2016. 王鵬. 法語葡萄酒結構語彙的漢譯[J]. 廣譯: 語言, 文學, 與文化翻譯, 2009 (2): 127-170. 王文杰. 我国进口葡萄酒名称的翻译策略研究[D]. 沈阳师范大学, 2013. 吴粤汕, 战吉宬. 中西方葡萄酒文化的差异与融合[J]. 中国食品, 2006(14):44-44. 邢力. 民族文学文化典籍的通俗经典化传播——评《蒙古秘史》罗依果英译[J]. 民族文学研究, 2010(3):151-159. 姚俊英. 如何让科技成果转化成科普新闻[J].新闻实践，2009(11):75. 杨柳. 通俗翻译的“震惊”效果与日常生活的审美精神——林语堂翻译研究[J]. 中国翻译, 2004(4):42-47. 杨建国, 项朝晖. 宋元讲史话本的通俗化特征初探[J]. 中国文化研究, 2000(1):71-77. 杨吉华. 国内葡萄酒文化传播存在的问题及对策——以山东省蓬莱市为例[J]. 中外葡萄与葡萄酒, 2011(12):4-9. 易丹.葡萄酒市场文化推广的四大误区[EB/OL].（2016-07-01）[2018-03-02]. http://www.wines-info.com/html/2016/7/183-66035.html. 于士清. 酒类术语英汉翻译浅析[J]. 英语广场旬刊, 2017(7):9-11. 闫志杰. “巴西葡萄酒市场综合研究”的翻译实践报告[D]. 西北师范大学, 2015. 闫宇涵. An Analysis of Translation of The World's Greatest Wine Estates (Excerpt)[D]. 对外经济贸易大学, 2012. 周娜. 当前我国古籍普及读物出版的通俗化研究[D]. 华中科技大学, 2016. 张敬源, 邱靖娜. 后现代语境下民族典籍翻译的通俗化改写——评首个国内《玛纳斯》英译本[J]. 外国语文, 2016, 32(6):136-142. 张淑芳. 财经新闻"去专业化"处理技巧探微[J]. 东南传播, 2010(11):111-113. 张志伟.“红酒名人”董树国：从媒体到经商[EB/OL].（2015-11-05）[2018-03-02]. http://www.yicai.com/news/4707900.html. 醉鹅娘.用8小时高能的课程，向50年的“哑巴红酒”教育宣战[EB/OL].（2018-01-19）[2018-03-02]. https://mp.weixin.qq.com/s/k4ibIPHQK-8VemN4sIf5BA. 中国社会科学院语言研究所词典编辑室. 现代汉语词典. 第7版[M]. 北京:商务印书馆, 2016. 朱玉增. 翻译趣闻中的进口葡萄酒营销[J]. 中外葡萄与葡萄酒:文化版, 2010(2):104-107. 周明. 论文本分析和翻译策略——《干邑》节译报告[D]. 烟台大学, 2013. Ariel, M. Accessibility Theory: An Overview [J]. Text Representation: Linguistic and psycholinguistic Aspects, 2001, 8: 29-87. Calsamiglia, H. Popularization Discourse [J]. Discourse Studies, 2003, 5(2):139-146. Calsamiglia, H, Dijk, T. Popularization Discourse and Knowledge about the Genome[J]. Discourse & Society, 2004, 15(4):369-389. Forget, L. "At Best an Echo": Eighteenth-and Nineteenth-Century Translation Strategies in the History of Economics [J]. 
History of Political Economy, 2010, 42(4):653-677. Gotti, M. Reformulation and Recontextualization in Popularization Discourse[J]. Ibérica, 2014, 27(27):15-34. Grego, K. The Physics You Buy in Supermarkets: Writing Science for the General Public: the Case of Stephen Hawkings[C]// Kermas S. The Popularization of Specialized Discourse and Knowledge across Communities and Cultures[M]. Bari: Edipuglia, 2013: 149-172. Liao, M. Popularization and Translation[C]//Gambier Y, Van L. Handbook of Translation Studies [M]. Amsterdam: John Benjamins Publishing, 2010:130-133. Liao, M. Interaction in the Genre of Popular Science[J]. Translator, 2011, 17(2):349-368. Montalt, V, Gonzalez-Davies M. Medical Translation Step by Step: Learning by Drafting [M]. Manchester: St. Jerome, 2007. Merakchi, K, Rogers, M. The Translation of Culturally Bound Metaphors in the Genre of Popular Science Articles: A Corpus-based Case Study from Scientific American Translated into Arabic[J]. Intercultural Pragmatics, 2013, 10(2):341-372. Shuttleworth, M. Translational Behaviour at the Frontiers of Scientific Knowledge[J]. Translator, 2011, 17(2):301-323. Víctor, G. Trying to See the Wood Despite the Trees: A Plain Approach to Legal Translation[C]//Cheng L, Sin K, Wagner A. The Ashgate Handbook of Legal Translation [M]. Surrey: Ashgate, 2014:71-88. Wine Intelligence. Wine Intelligence China Portraits 2015[R]. Wine Intelligence, 2015. Wine Intelligence. Wine Intelligence China Portraits 2017[R]. Wine Intelligence, 2017. ﹀
馆藏号：	017/M2018(251)
公开日期：	2018-05-25

汉学社科类著作中本源概念的翻译研究——以《珠三角的女儿》为例.李文婷

题名：	汉学社科类著作中本源概念的翻译研究——以《珠三角的女儿》为例
姓名：	李文婷
学号：	1501210591
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软件与微电子学院
论文答辩日期：	2018-05-25
外文题名：	Translating Indigenous Concepts in Sinological Social Science Works—A Case Study of Daughters of the Canton Delta
关键词：	汉学社科著作；本源概念；翻译策略；不落家；自梳
外文关键词：	Sinological social sciences; Indigenous concepts; Translation strategies; Delayed transfer marriage; Sworn spinsterhood
论文摘要：	︿本次翻译实践基于贾妮丝·斯托卡德所著的《珠三角的女儿：1860-1930 年中国华南地区的婚姻模式和经济策略》。这是一本汉学社科类著作，介绍了 19 世纪末 20 世纪初存在于珠江三角洲的不落家婚俗和由此衍生的自梳女群体。由于该书包含大量当地特有、难为其他地区的人所理解的本源概念，而本源概念在跨文化交际中不可避免，而且迄今为止，本源概念在翻译研究和实践中运用还不多，学界尚未大量开展这方面的研究，因此，笔者选择研究汉学社科类著作中本源概念的翻译策略。在介绍本源概念必要的背景信息后，笔者首先分析作者在异语写作过程中对本源概念的英译策略和模式，探讨其中的翻译难点。笔者发现，作者为了尽可能保留当地文化特色，不仅主要使用音译和直译的方法，更创造单词和短语。基于此，笔者提出在翻译汉学社科类著作时，应以归本还源为翻译宗旨，遵循本土性、可理解性、专业性和明确性的翻译原则，采取定位概念、灵活还原、合理仿造和适度改写的翻译策略，并通过译例论证上述原则和策略对于解决翻译难点的合理性和有效性。此外，笔者还分析了因多套拼音方案混用和拼写错误造成的另一翻译难点。在具体的翻译过程中，笔者综合运用了权威书籍、平行文本和语料库这三类翻译工具，试图从词汇、句子和文化背景层面准确传达原文的意思，保证译文质量。﹀
外文摘要：	︿ This translation project is based on Daughters of the Canton Delta: Marriage Patterns and Economic Strategies in South China, 1860-1930, written by Janice Stockard. It is a sinological social science work that introduces a distinctive marriage pattern, delayed transfer marriage, and the community of sworn spinsters that arose from it, both of which existed in the Canton Delta in the late 19th and early 20th centuries. Because the book contains many indigenous concepts that are unfamiliar to people from other regions and cultures, and because little research has so far addressed the subject, this paper discusses translation principles and strategies for the indigenous concepts common in sinological social science works. After outlining the necessary background on indigenous concepts, the paper analyzes how Stockard handles these concepts in her writing. It finds that, to preserve local cultural features, Stockard relies mainly on transliteration and literal translation and also coins new words and phrases. On this basis, the paper proposes that translators of sinological social science works should aim to return such concepts to their source. To that end, it puts forward four translation principles, which emphasize nativeness, understandability, technicality, and clarity, and four translation strategies: building a glossary to locate indigenous concepts in parallel texts, flexibly back-translating concepts, reasonably modelling concepts on existing expressions, and moderately rewriting concepts. Examples from the translated text demonstrate the rationality and validity of these principles and strategies. The paper also analyzes difficulties caused by the mixed use of several romanization schemes and by spelling errors. To produce a high-quality translation, the author draws on anthropological textbooks and dictionaries, parallel texts, and several Chinese corpora. ﹀
分类号：	H087/TP391
论文总页数：	195
参考文献总数：	73
馆藏号：	017/M2018(319)
公开日期：	2018-05-25

基于增强型电子书的音乐剧著作的深度翻译策略研究——以《美国音乐剧的秘密生活》为例.乌天骄

题名：	基于增强型电子书的音乐剧著作的深度翻译策略研究——以《美国音乐剧的秘密生活》为例
作者：	乌天骄
学号：	1501210725
语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师姓名：	李博婷
导师单位：	软件与微电子学院
答辩日期：	2018-05-25
关键字(中文)：	音乐剧著作；增强型电子书；深度翻译策略；美国音乐剧；百老汇
文摘：	︿本文是基于美国作者杰克•维埃泰尔（Jack Viertel）的《美国音乐剧的秘密生活：百老汇表演是如何设计的》（The Secret Life of the American Musical: How Broadway Shows Are Built）一书的翻译实践报告。目前国内对音乐剧著作的翻译实践还比较少，相关翻译研究也处于比较边缘的位置且缺乏针对性；考虑到国内音乐剧市场发展仍处于初级阶段、国内核心观众群体尚未形成以及中西方读者之间的巨大差异，简单直接地对原书进行翻译而不阐释、补充书中的专业知识和背景信息容易造成理解障碍。因此，寻找合适的翻译策略来指导音乐剧著作翻译中的问题既有理论意义又有实践意义。笔者通过大量文献研究发现深度翻译策略适用于指导音乐剧著作的翻译实践，通过“深度语境化”补充中国读者缺少的音乐剧专业知识和背景信息，能够帮助读者更好地阅读译文，更好地欣赏美国音乐剧文化。作为一种舞台表演艺术，音乐剧可视听的特点决定了其著作的特殊性。作者在书中多处引用了音乐剧的歌词，然而静态的文字形式很难表现动态的音乐剧表演。因此笔者将音乐剧的视听特点与当前最新颖的增强型电子书（enhanced ebook）技术相结合，利用其支持的图集、音频、视频、弹出项等多媒体和交互形式展现音乐剧本身的特点，同时结合笔者制作的增强型电子译本探讨更丰富的深度翻译策略。如何在为读者扫除理解障碍的同时，不过度使用深度翻译策略，避免读者阅读负担的增加，则需要译者制定满足大多数读者需求和翻译实践的深度翻译原则。为此，笔者对以音乐剧爱好者为主体的译文潜在读者进行了读者调研，通过收集读者反馈来辅助深度翻译原则的确定和深度翻译实践的改进。最后，本文将具体论述适用于音乐剧著作的深度翻译原则，即自主性原则、适度性原则和准确性原则；然后根据增强型电子译本和翻译实践具体分析深度翻译策略，即1）补充图文信息2）使用弹出项3）合理使用多媒体4）添加译者附录和词汇表，以期为外国音乐剧著作的翻译策略和译本展现形式的创新提供借鉴。﹀
分类号：	H087
论文总页数：	217
参考文献数：	70
馆藏号：	017/M2018(330)
公开日期：	2021-05-25

科幻虚构词的偏离手段及翻译策略——以《神秘博士：耀眼的黑暗》为例.宋雅雯

题名：	科幻虚构词的偏离手段及翻译策略——以《神秘博士：耀眼的黑暗》为例
作者：	宋雅雯
学号：	1501210670
语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师姓名：	李博婷
导师单位：	软件与微电子学院
答辩日期：	2018-05-25
题目(外文)：	Deviations and Translation Strategies of Science Fiction Neologies – A Case Study of Doctor Who: Shining Darkness
关键字(中文)：	科幻虚构词；偏离；翻译策略；异世感
关键字(外文)：	science fiction neologies; deviations; translation strategies; sense of dissimilar world
文摘：	︿随着中国科幻市场的蓬勃发展，科幻翻译的需求日渐增长。一些文化公司为满足市场需求，开始启用非职业译者出身的粉丝译者。然而国内对科幻翻译的研究多集中于宏观层面，忽略了在科幻细节上的翻译指导。例如，国外学者韦斯特福尔和小伊斯塔万发现小说中高频出现的科幻虚构词是构成“科幻之美”的要素之一；但国内对科幻虚构词的翻译研究却处于初始阶段，不仅数量寥寥，而且不乏将其视为文化负载词、文化专有项的研究。因此，本文以“科幻虚构词”为研究主体，探索构成科幻虚构词的偏离手段，并以《神秘博士：耀眼的黑暗》为翻译案例，试提出针对性的翻译原则及翻译策略。本文首先根据科幻作家及学者的著作，合理界定科幻虚构词的内涵。研究过程主要分为两部分。第一部分通过文献分析、受众分析及《神秘博士》中文小说的评价词频统计，确立译文的质量要求；依据质量要求，对比分析现有博士系列中文小说的出书译本和网络译本，并针对译本中存在的共性问题，提出适用于科幻虚构词的翻译原则。第二部分以偏离理论在文学文体及翻译领域的应用为理论依据，创造性地将该理论应用在科幻翻译中，阐述科幻虚构词与科幻文体之间的关系。然后根据利奇对偏离手段的八种分类，总结多部科幻小说中科幻虚构词的偏离手段，并在前人提出的直译、意译等翻译策略上，进一步提出更细化、更具针对性的翻译策略。研究发现，科幻虚构词的翻译存在五大共性问题：不译、错译、不一致、过度普通化或超自然化，以及词源联想不足。针对过度普通化和超自然化现象，本文提出异世感原则；对词源联想不足，提出信息量对等原则；对错译和不一致现象，分别提出准确性和一致性原则。文章还通过译者访谈和实际案例，阐述这四项原则的内涵及使用场景。在科幻虚构词的偏离研究中，本文发现书写偏离、词汇偏离和语义偏离是最为常见的。其中词汇偏离在前人发现的缩略词、拼缀词和派生词的基础上，增加了词法偏离、无意义音节、低频搭配。重点对无意义音节构成的科幻虚构词进行探讨，总结出神话、外来语和语音三种联想方式，以弥补词源联想不足；并提出借用典故法、字音暗示法和字义暗示法三种翻译策略，使译文符合信息量对等原则。针对书写偏离，提出直译加引和添加类词缀的翻译策略；词法偏离、低频搭配手段下的科幻虚构词，可采用核心词替换策略、拼接策略和四字格创译策略。此外，本文还统计中文科幻小说中科幻虚构词的字数分布，对译文过长的缩略词、拼缀词提出含义压缩策略。﹀
文摘（外文）：	︿ The science fiction market of China has received extensive attention in recent years, which has increased the demand for science fiction translation. However, domestic studies on Sci-Fi translation have focused more on macro aspects, such as overall translation quality or domestication/foreignization. At the micro level, translation studies on science fiction neologies, which according to Westfahl and Istvan are a major feature of this genre, are still at a preliminary stage, as many of them treat the neologies as culture-loaded words. Therefore, this paper explores the Sci-Fi neologies’ deviations, translation principles and translation strategies based on a case study of Doctor Who: Shining Darkness. This paper first defines the meaning of Sci-Fi neologies based on the works of Sci-Fi writers and scholars. Secondly, a comparative analysis of current translations of other Doctor Who novels leads to the proposal of translation principles, which are the principle of sense of dissimilar world, the principle of consistency, of equal information load, and of accuracy. Thirdly, with Leech’s categorization of deviations, this paper finds that writing deviation, lexical deviation, and semantic deviation are frequently used in Sci-Fi neologies, with morphologically deviated words, meaningless syllables and low-frequency compounds as the sub-categories of lexical deviation. In terms of translation strategies, this paper proposes translation via allusions, character pronunciation, and word meaning for meaningless syllables; and a core word replacing strategy, a splicing strategy, and four-character creative translation for morphologically deviated words and low-frequency compounds. Strategies like literal translation plus quotation marks and quasi-affix addition can be used to translate neologies with writing deviation, and meaning contraction can be used for Sci-Fi neologies that have complex implications and lengthy translations. ﹀
分类号：	H087
论文总页数：	224
参考文献数：	39
馆藏号：	017/M2018(415)
公开日期：	2021-05-25

复杂历史文本翻译中背景知识图的设计与应用——以《成吉思汗的宗教思想：世界征服者给予我们的宗教自由》翻译为例.刘珈池

题名：	复杂历史文本翻译中背景知识图的设计与应用——以《成吉思汗的宗教思想：世界征服者给予我们的宗教自由》翻译为例
作者：	刘珈池
学号：	1401210646
语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师姓名：	李博婷
导师单位：	软件与微电子学院
第二导师姓名：	张宏岩
第二导师单位：	软件与微电子学院
答辩日期：	2018-05-25
题目(外文)：	The Design and Application of Background Knowledge Infographics in Complex Historical Text Translation
关键字(中文)：	复杂历史文本；背景知识；翻译过程；信息图示
关键字(外文)：	complex historical text; background information; translation process; infographic
文摘：	︿复杂历史文本因时间、空间变量大，导致不同历史时期，同一概念所指代的意义发生变化，对译者的理解和翻译造成障碍。仅靠抽象思维能力难以抓住概念变化的原因，因而译者无法确定概念词汇在不同历史时期和宗教教义下的恰当翻译。在翻译过程的实证研究中，研究者运用系列技术手段和研究方法，对译者的翻译过程进行描述。但前人的研究仍有一些不足，一方面，少有研究者提出可以帮助译者解决翻译过程中存在问题的可行方法；另一方面，以往的研究方法在一定程度上受文本长度、文本复杂性和适用性等因素制约，不适用于复杂历史文本的翻译过程研究。故而笔者在翻译实践中，针对翻译过程中的概念认知和概念词汇翻译问题，提出一套可以帮助译者解决概念认知与翻译困难的方法。本文使用文献研究法，结合《成吉思汗的宗教思想》一书的翻译实践，提出了绘制背景知识组图并与相应概念词汇建立索引的方法，用以解决翻译中的概念认知和概念词汇翻译的困难。具体方法是用信息图示对作为翻译背景信息的时空和宗教信息进行表征，再通过分析中文概念和作者观点了解不同概念在各历史时期的词汇表达，然后赋予民族政权概念词汇和宗教概念词汇唯一编码，最后建立词汇与图组间的索引关系。复杂历史文本翻译中，背景知识图的设计与应用能够作为译者翻译认知过程的辅助手段，帮助译者对翻译文本中变化的概念有更准确的认识，利用构建背景知识组图和动态概念词汇索引的方法，让高频概念词汇翻译的准确性和统一性得到提升。﹀
文摘（外文）：	︿ Complex historical texts span wide ranges of time and space, so the meaning of the same concept word changes across historical periods, which creates difficulties in understanding and translation. It is difficult to identify the reasons for such conceptual change by abstract thinking alone. Therefore, translators cannot find appropriate translations for concept words in different historical periods and different religions. Although researchers have used a series of techniques and research methods to study the translation process, limitations still exist. On the one hand, few researchers propose feasible methods that can help translators solve the problems in the translation process. On the other hand, previous research methods are to some extent constrained by the length and complexity of the text, and therefore do not apply to the study of the translation process of complex historical texts. Therefore, in the translation practice, the author proposes a method which could help the translator to solve the problems of concept cognition and concept word translation. This article uses the literature research method and, combined with the translation practice of Genghis Khan’s Quest for God, proposes a method of drawing background knowledge infographics and indexing the corresponding concept words to solve difficulties in concept cognition and translation. The making of background knowledge infographics put forward here consists of four steps. Firstly, represent the background historical and religious information as maps. Secondly, analyze the Chinese concept expressions of different historical periods and the author's point of view. Thirdly, assign a unique code to every concept word of ethnicity, nation and religion. Fourthly, create an index between the words and the infographics. In the translation of complex historical texts, the design and application of background knowledge infographics are helpful in the cognitive process and in translation. By giving translators a more accurate understanding of the concept words’ changing meaning, the method improves the accuracy and consistency of high-frequency concept word translation. ﹀
分类号：	H31
论文总页数：	214
参考文献数：	60
馆藏号：	017/M2018(427)
公开日期：	2021-05-25

面向海外粉丝型受众的国产剧字幕反常化翻译策略研究.李梅娟

题名：	面向海外粉丝型受众的国产剧字幕反常化翻译策略研究
姓名：	李梅娟
学号：	1501210585
论文语种：	chi
专业：	专业学 - 工程 - 计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软件与微电子学院
论文答辩日期：	2018-05-25
外文题名：	Abusive Subtitling Strategies of Chinese Dramas Based on Overseas Fans' Needs
关键词：	国产剧；字幕翻译；海外粉丝；需求分析；反常化策略
外文关键词：	Chinese dramas; overseas fans; abusive subtitling; needs analysis
论文摘要：	︿自 2014 年年底开始，中国内地电视剧、网剧（以下简称国产剧）在 YouTube 的搜索指数开始超越日剧和台剧，吸引了众多海外观众，甚至催生了一批热衷于翻译国产剧的海外粉丝型字幕组。国产剧海外传播方兴未艾，然而，国产剧字幕英译现状却不尽如人意，过度删减、归化、篡改等翻译特点引起了海外粉丝型受众的不满。遗憾的是，此现象尚未引起学术界的广泛关注，国内翻译界对字幕翻译的研究也较少涉及对受众需求的分析。针对这一研究空缺，本文以诺恩斯的反常化翻译策略为指导，基于国产剧海外粉丝型受众对字幕翻译的需求，针对国产剧中的习语典故、流行语、文字游戏3个翻译难点，提出国产剧字幕反常化翻译的5项原则和6个策略，并通过受众满意度评分验证了策略的有效性。具体而言，本文的研究问题为国产剧海外粉丝型受众对字幕翻译的需求是怎样的、怎样的翻译策略符合粉丝型受众的需求、这些翻译策略是否行之有效。研究方法上，本文综合了网络民族志、访谈、问卷调查、文献分析等多种方法。前期，笔者参与了10 余部国产剧的英译工作，访谈译者 10 人、海外粉丝 20 人，通过翻译实践和访谈，分析国产剧的语言特点、翻译难点和当前字幕的满意度；在实践和访谈的基础上，设计调查问卷，面向国产剧海外粉丝群体发放问卷，有效回收106份，对受众需求进行定量描述；基于受众需求，本文提出了若干翻译策略，并以译例分析的形式加以论证；最后，本文通过受众评分的方式对翻译策略进行有效性验证。研究发现，海外粉丝型受众最喜爱的国产剧类型为古装奇幻剧和青春偶像剧，其翻译难点为习语典故、流行语和文字游戏；海外粉丝型受众期待译文的本真性和易理解性，认为传统规范中要求的简洁性和自然性无关宏旨。因此，本文认为常规的字幕翻译惯例与粉丝需求并不相符，应该采用反常化的翻译策略以适应粉丝需求。本文提出了本真性、解释性、无植入性、友好性和常用性五个翻译原则，以及“三多两少一换”的翻译策略，具体有括号加注、增译语义、避免使用西方特有的文化元素、删改歧视性信息、替换流行语五个语言性策略以及变换字体样式的视觉性辅助手段。策略验证结果表明，采用反常化翻译策略的字幕译文得分比采用传统惯例的译文得分更高，粉丝满意度更高。本文从翻译的角度对国产剧的字幕英译进行了规定性研究。希望本文可以启发和鼓励更多的翻译界、电视剧艺术界和传播界学者关注和研究国产剧的海外传播。﹀
外文摘要：	︿ Chinese mainland dramas (hereinafter, C-dramas) have attracted a large number of overseas fans in the past five years. Since many of the fans can’t read Chinese, they have to rely on English subtitles translated either by paid professional translators or by unpaid amateur translators. However, be they professional or amateur, the English subtitles are far from satisfactory. Many overseas fans complain about information loss, over-Americanization and improper adaptation. Unfortunately, few studies concerning the relationship between the audience’s needs and subtitling strategies have been done. This paper proposes six abusive subtitling strategies after a comprehensive study of the overseas fans of C-dramas. Specifically, this paper aims to answer the following four questions: Who are the C-drama fans? What are their expectations and needs in terms of subtitling? What kind of subtitling strategies suit their needs? Do these strategies work? This paper conducts several interviews and a large-scale survey in order to analyze the fans’ profiles, viewing preferences, viewing motivations and their expectations for the English subtitles. It is found that authenticity and understandability are the top two priorities for the overseas fans. In order to best satisfy these two principles, this paper proposes six abusive subtitling strategies, including both graphic and linguistic strategies, namely, literal translation with notes, additional translation, substitution, deletion or modification of discriminatory content and no western-specific references. Finally, this paper conducts a verification experiment on the six strategies, which shows them to be effective. This paper has provided a prescriptive study of how to make better subtitles. Future studies can account for the phenomenon from the perspectives of cross-cultural communication, sociology, TV art and more. ﹀
分类号：	H087
论文总页数：	100
参考文献总数：	43
Call number:	017/M2018(473)
Release date:	2018-05-25

技术写作在分布式敏捷开发中的沟通管理研究–以K公司为例.李慧敏

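Title:	技术写作在分布式敏捷开发中的沟通管理研究--以K公司为例 (Communication Management for Technical Writing in Distributed Agile Development: A Case Study of Company K)
Author:	李慧敏
Student ID:	1501210582
Language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public access:	After 3 years
Level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor:	李博婷
Advisor's affiliation:	School of Software and Microelectronics
Second advisor:	张宏岩
Defense date:	2018-05-25
Keywords:	technical writing; distributed agile; communication management
Abstract:	In a traditional development environment, documentation engineers rely on detailed product specifications produced early in the project and on face-to-face communication to understand the product fully. As distributed agile development spreads, project environments grow more complex, detailed up-front input documents and face-to-face communication are no longer available, and documentation engineers face serious communication challenges. Yet very little research, in China or abroad, addresses communication management in technical writing. Studying the communication problems of technical writing teams in distributed agile development, and the strategies that make their communication effective, therefore has both theoretical and practical value. Drawing on the author's internship experience, interviews, and observation, this thesis compiles twelve concrete cases that reconstruct the communication problems inside a technical writing team in distributed agile development, as well as the challenges of communicating with illustration engineers and with production R&D engineers. The problems identified are information asymmetry, unclear communication counterparts, difficulty encoding information, difficulty decoding information, and long feedback times. Questionnaires and interviews with the technical writing team and the production R&D team trace these problems to causes at three levels: communication hierarchy and communication support at the company level; agile development, distribution, and product characteristics at the project level; and knowledge gaps and choice of communication medium at the individual level. Building on this analysis of problems and causes, and on observation and theory, the thesis proposes introducing knowledge management to improve communication efficiency. The effect of this strategy is then verified in three respects: the extent of document revisions, documentation engineers' working hours, and their attitude toward work. The thesis aims to offer communication experience and recommendations for technical writing in distributed agile development.
Classification number:	C939
Total pages:	75
Number of references:	55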
Call number:	017/M2018(557)
Release date:	2021-05-25

西方艺术史书籍中文化因素的翻译策略—以《20世纪的艺术》为例.马璇

Title:	西方艺术史书籍中文化因素的翻译策略—以《20世纪的艺术》为例
Name:	马璇
Student ID:	1401210681
Thesis language:	chi
Major:	Professional degree - Engineering - Computer Technology
Public access:	Public
Level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1:	王雷
Advisor 1's affiliation:	School of Foreign Languages
Advisor 2:	李博婷
Advisor 2's affiliation:	School of Software and Microelectronics
Defense date:	2018-05-25
Foreign-language title:	Translation Strategies of Cultural Factors in Western Art History Books: Taking Art of the 20th Century as an Example
Keywords:	Western art history; cultural factors; cultural translation; translation strategies
Abstract:	Western art history books contain a large number of cultural factors, such as social and cultural backgrounds, artistic styles and schools, art movements, materials and techniques, and artistic ideas. At present, however, translation studies on cultural factors focus mainly on literary translation; there are few studies in non-literary fields, and the translation of art-related subject matter is often passed over briefly. Art of the 20th Century recounts the history of Western art in the 20th century, exploring in chronological order the artistic styles, schools, and artists' activities in painting, sculpture, and architecture. The Chinese translation will be published by the People's Fine Arts Publishing House, and the author of this thesis is the translator of the whole book. On the basis of translating Chapters 5, 6, 7, and 8, and in light of the publisher's translation quality requirements, this thesis sums up three principles to be followed in translating cultural factors: accuracy, consistency, and accessibility. On the basis of these three principles, it proposes corresponding translation strategies: cultural discrimination (clarifying the underlying connotation and avoiding overly literal readings; inferring from context; identifying titles of works and keeping them consistent; analyzing with the help of pictures of the works), cultural substitution (replacing an expression with the specific meaning it refers to; borrowing Chinese expressions), and cultural compensation (adding culturally relevant information; drawing out deeper cultural connotations; adding pictures to aid understanding). The thesis aims to study translation strategies for cultural factors in Western art history books, to offer translation experience and reference for translators of works on art, and to promote communication and exchange between Eastern and Western art and culture.
Classification number:	H059
Total pages:	242
Total references:	60
Call number:	017/M2018(619)
Release date:	2018-05-25

2018-05-24

时态成分对句子的语义贡献.郑莉莉

Title:	时态成分对句子的语义贡献
Author:	郑莉莉
Student ID:	1501213130
Language:	eng
Major:	Literature - Foreign Languages and Literatures - Foreign Linguistics and Applied Linguistics
Public access:	After 3 years
Level:	Master's
Degree:	Master of Arts
Institution:	Peking University
School:	School of Foreign Languages
Advisor:	何卫
Advisor's affiliation:	School of Foreign Languages
Defense date:	2018-05-24
Foreign-language title:	The semantic contributions of tense elements to sentences
Keywords:	truth condition; semantic contribution; tense; generative grammar
Abstract:	This paper investigates the semantic contributions of tense elements to sentences. To that end, it examines Frege's function-argument approach to syntactic analysis and the principle of compositionality, followed by a critical review of two major approaches linguists commonly take to sentence analysis, namely Phrase Structure Grammar and X-bar Theory. The X-bar schema of generative grammar and its minimalist counterpart, Merge, are chosen as the syntactic framework for further semantic analysis, because both the X-bar grammar and the binary operation of Merge suggest the endocentric property of all generated linguistic constructions, which can be reduced further to function-argument constructions. To solve the problem that the generative framework has neither a least nor a greatest element, insights from categorial grammar are borrowed. While sentence s and noun n are accepted as the greatest and the least element respectively in categorial grammar, it is argued that the corresponding greatest element in the generative system is IP, which is marked by the tense element. To explore the truth-conditional contributions of tense elements to sentences, this paper first constructs the meaning from N to untensed VP on the basis of generative syntax; then, through a critical review of the interpretation of tense in traditional grammars and in tense logic, it claims that the functional category I in generative grammar, once combined with the tense interpretation provided by traditional grammarians such as Jespersen and Quirk, can give a unified account of how tense elements contribute to the truth conditions of the whole sentence, even without the aid of any tense-logical representation.
Classification number:	H04
Total pages:	54
Number of references:	26
Call number:	039/M2018(04)
Release date:	2021-05-24

2018-05-19

《公安基层派出所激励机制设计与应用》.吴书光

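Title:	《公安基层派出所激励机制设计与应用》 (Design and Application of an Incentive Mechanism for Grassroots Police Stations)
Author:	吴书光
Student ID:	1301221697
Language:	chi
Major:	Professional degree - Engineering - Software Engineering
Public access:	After 3 years
Level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor:	俞敬松
Advisor's affiliation:	School of Software and Microelectronics
Defense date:	2018-05-19
Keywords:	police stations; incentive mechanism; incentive model; incentive strategy
Abstract:	In recent years, as China's public security system has gradually improved, the composition of the police force has changed markedly. Grassroots police officers are an important part of today's force and have become key to the future development of public security work. How to solve their incentive problems and tap their potential to the fullest is a pressing issue in managing the police force. At present, grassroots officers are motivated mainly through pay, performance appraisal, promotion, and competition, which has had some effect. Problems remain, however: deviation from organizational goals, lack of targeting, problems with differentiated incentives, oversimplified incentive methods, and a lack of systematic thinking about incentives. The main causes are: first, a "people-oriented" mindset has not truly been established; second, officers' particular interests and demands are not clearly understood; third, incentive theory and practice have not been studied in sufficient depth; fourth, the motivational function of ideological and political education receives too little attention. Needs are the starting point of motivation, and police officers have particular dominant needs, so a systematic investigation and analysis must start from those needs. The contribution of this thesis is that, on the basis of thorough surveys and interviews on the current state of incentives for grassroots officers, it provisionally consolidates officers' dominant needs and designs an incentive mechanism model for grassroots officers. The model comprises four interrelated and mutually supporting systems: an incentive environment system, an incentive factor system, an incentive operation system, and an incentive feedback system. Building the incentive mechanism for grassroots officers requires five things: strengthening the motivational function of ideological and political education, establishing a needs-analysis system, establishing a career-planning system for officers, improving the points-based appraisal system for officers, and cultivating a motivating police culture. On this basis, points-based incentives are used as a concrete, actionable measure to build the overall incentive system. Having reviewed incentive theories in China and abroad and the incentive models used by enterprises, and having gathered the opinions of grassroots officers through surveys and interviews, the thesis follows the proposed model and strategies, compares the points-based approach with several other incentive methods, and selects the points-based incentive model. A points-based system for grassroots officers was established and trialed; implementation at a grassroots police station achieved good results but also revealed some problems that call for further improvement.
Classification number:	TP3
Total pages:	63
Number of references:	69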
Call number:	017/M2018(332)
Release date:	2021-05-19

2017-11-29

针对写前准备阶段的英语写作训练系统前端设计和实现.王勤晓

Title:	针对写前准备阶段的英语写作训练系统前端设计和实现
Name:	王勤晓
Student ID:	1401210749
Major:	Computer Technology
Public access:	Public
Level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1:	俞敬松
Advisor 1's affiliation:	School of Software and Microelectronics
Defense date:	2017-11-29
Foreign-language title:	The Front-End Design and Implementation of English Writing Training System in Preparation Stage
Keywords:	Pre-writing stage; Writing preparation; Chunk teaching; Mobile learning; Front-end design and implementation
Abstract:	Writing is an important part of English examinations and a difficult part of learning English. Widely accepted teaching theory and practice hold that insufficient pre-writing preparation, that is, insufficient language input and knowledge accumulation, is the crux of Chinese students' difficulties with English writing. There is no shortage of English writing assistants and automatic essay-correction tools on the market, but they focus either on the writing process or on feedback on the result; few learning systems target the pre-writing preparation stage. In addition, existing English learning applications, although full-featured, are not tied to writing and are detached from real usage scenarios, and many lack guidance from sound writing pedagogy and language learning theory. With the spread of mobile devices and advances in information technology, mobile adaptive learning has become a trend in online education, yet no English writing application has entered this area. To address these problems, this thesis analyzes and surveys the problems and needs of target users and, taking into account the characteristics of mobile learning and building on English writing pedagogy and language learning theory, develops a mobile learning system for the pre-writing preparation stage of English writing. Compared with conventional writing instruction, the core feature of the system is that, under a given writing topic, chunks serve as the basic unit of learning, in contrast to traditional vocabulary and grammar study; writing input and output are practiced by storing and retrieving chunks as wholes, supported by an optimized software implementation, to help students prepare before writing. Within this framework of topic-based chunk learning, the system teaches writing chunks through two main tasks: "Prefabricated chunks learning" and "Model essay's chunks learning". The "Prefabricated chunks learning" task provides prefabricated chunks related to the writing topic to help users take in and produce chunks in a mobile environment; the "Model essay's chunks learning" task is built around a model essay on the topic and, through study of the core chunks in the passage, helps users learn the essay's language and logical structure. In each task, based on chunk teaching practice and the characteristics of mobile systems, the chunk learning process is divided into four stages: chunk optimization, chunk input, chunk internalization, and chunk output. The thesis discusses the design of these four stages systematically, selects representative problems in learning-flow design and interface interaction design for experiments, and demonstrates the effectiveness of the proposed design in reducing users' cognitive load and helping learners achieve their learning goals. For the front-end implementation, the system adopts hybrid mobile application development, using web development technologies to build Android and iOS applications. The thesis discusses key technical aspects such as the system's technical architecture and front-end page structure. For front-end performance optimization, the system concentrates on optimizing DOM operations, in line with the characteristics of the project, and experimental tests confirm the effectiveness of the optimization.
Classification number:	TP311
Total pages:	80
Total references:	73
Call number:	017/M2017(840)
Release date:	2017-11-29

情感化设计在技术文档中的应用研究.余瑶

题名：	情感化设计在技术文档中的应用研究
姓名：	余瑶
学号：	1301211059
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	高志军
论文答辩日期：	2017-11-29
关键词：	技术文档情感化设计网页用户文档
论文摘要：	︿技术文档作为连接用户和产品之间的桥梁，指导着用户如何有效、正确地进行产品操作，在技术日新月异的当今社会，其传播价值毋庸置疑。然而另一方面，技术文档的使用情况却不甚理想。虽然大多数用户能认识到文档的重要性，但由于使用体验不佳等问题，导致技术文档的使用率迟迟不能得到提升。因此，针对技术文档的设计创新势在必行。以往对于技术文档的设计研究，往往集中在提升可用性方面，但用户体验的提升不仅仅需要良好的可用性作为支撑，也同样需要保证用户的情感体验。而情感化设计理念为解决上述问题提供了良好的契机。因此，本研究中尝试将情感化设计方法应用到技术文档，尤其是网页式用户文档的设计中，并就其对用户产生的影响进行实证研究。本研究首先对技术文档的情感目标进行了分解分析，通过组织用户调查确定了在使用技术文档过程中相关度最高的三种消极情感：焦虑、无聊和疑惑。其后，通过对用户进行访谈确定了与情感相关的设计要点，并结合情感化设计理论和网页端用户文档的形式和内容特点，提出了文档的情感化设计策略。最后通过实验对改写后文档的情感效果及对用户产生的影响进行了验证。实验结果显示，与原文档相比，经过特定策略设计过后的文档所产生的消极情感明显降低，在单一维度情感方面，则能显著缓解用户的无聊感。使用效果方面，虽然情感化文档并不能显著影响用户的初次任务和迁移任务表现，甚至对用户的初次任务表现有一定的阻碍作用，但是对于用户的保持性任务表现有明显的提升作用，即情感化文档可以促进用户对文档内容的长期记忆。同时，情感化文档的使用能显著提升用户的动机水平。由实验可知，经过情感化设计的文档在优化了部分使用效果的同时，没有造成明显的负面影响，其应用前景看好。在实际应用中，可以考虑将情感化策略引入到技术文档的设计中。﹀
分类号：	H087/TP391
论文总页数：	86
参考文献总数：	53
馆藏号：	017/M2017(856)
公开日期：	2017-11-29

引申译法在英汉翻译中的应用——以《我与克鲁克的边境军旅生活》为例.陈纯

题名：	引申译法在英汉翻译中的应用——以《我与克鲁克的边境军旅生活》为例
姓名：	陈纯
学号：	1301210567
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软件与微电子学院
论文答辩日期：	2017-11-29
外文题名：	The Application of Extension in English-Chinese Translation: A Case Study of On the Border with Crook
关键词：	文学翻译引申翻译策略英汉翻译
外文关键词：	Literary translation; Extension; Translation methods; English-Chinese translation
论文摘要：	︿《我与克鲁克的边境军旅生活》是一本回忆录，由美国人约翰·格雷戈里·伯克撰写，于1891年出版。这本书讲述19世纪70、80年代，作者作为军队军需官在美国西南部和印第安人作战的事。作者不仅描写了军队行军打仗的情况，还详细介绍了美国西南地区的生态环境和当地居民的日常生活。笔者在翻译此书的过程中，发现书中表达上蕴含大量晦涩的信息，按字面意思翻译得到的译文效果并不好，需要采用引申译法进行翻译，这使得笔者对引申译法产生兴趣，并将引申译法作为研究主题。笔者将引申分为以下三大类：语言语境引申、文化引申和艺术引申，然后借助词典、互联网资源和语料库等翻译辅助工具，研究翻译实践中的典型译例，总结引申译法在翻译过程中的具体实现手段及局限性。引申译法的具体手段包括：遵照逻辑，挖掘语义实质法；语义抽象化与具体化译法；替代译法；凸显情感法；发散联想，润饰语言法。引申译法的局限性主要表现在以下三方面：对于内涵意义已为目的语读者所熟知的词汇，无需引申；对于能增添文章审美价值、能带给读者审美愉悦和精神享受的词汇，无需引申；一般情况下，引申后的译文中不宜出现目的语民族文化浓厚的词汇。﹀
外文摘要：	︿ On the Border with Crook, published in 1891, is a memoir written by American writer John Gregory Bourke. This book tells Bourke’s experience as an army quartermaster in the war against the Indians in the Southwest United States in the 1870s and 1880s. Bourke depicts not only military life but also the ecological environment and the daily life of the local residents in the Southwest of the United States. During the process of translation, the author came across many words with obscure meaning, which, if translated literally, would not produce a good translated text. As a result, the author proposes to adopt the extension policy. The author classifies the extension into three categories, namely linguistic contextual extension, cultural extension, and artistic extension. With the computer aided translation tools including dictionaries, Internet resources and corpora, the author studies the translation examples and discusses the extension methods and limitations in the process of translation. The extension methods include extracting semantic essence by logic, semantic abstraction and concretization, replacement, revealing emotions, and divergent association with language polishing. The limitations of the extension policy exist in the following three aspects: no extension is needed for the words whose connotation is familiar to the target readers; no extension is needed for the words which can add to the text’s aesthetic value and bring readers aesthetic pleasure; the extension methods should not produce in a translated text words with heavily loaded meanings in the target culture. ﹀
分类号：	H087/TP391
论文总页数：	201
参考文献总数：	31
馆藏号：	017/M2017(871)
公开日期：	2017-11-29

2017-11-20

基于任务型和游戏化的高职词汇教学研究.李尚

题名：	基于任务型和游戏化的高职词汇教学研究
姓名：	李尚
学号：	1401210611
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-11-20
外文题名：	Research on Higher Vocational College English Vocabulary Teaching Based on Task-Driven Method and Gamification
关键词：	高职词汇习得任务型游戏化
外文关键词：	L2 vocabulary acquisition; task-driven; gamification
论文摘要：	︿词汇是语言的基础,掌握好一定的词汇知识是熟练使用英语的前提。随着信息技术的进一步发展,我国的高职课堂词汇教学取得显著进展的同时,也存在着较大的改善空间。高职学生作为我国教育发展进程中不可或缺的重要群体,较少的英语词汇知识储备、较强的抵触心理是阻碍他们进一步学习英语的屏障。现有高职词汇的课堂教学安排忽略了高职语言教学的重点在于交际使用;课堂教学过程以教师为中心,缺乏对学生个性差异的关注,不能很好地调动学生的积极性和参与度;对学生词汇的学习与练习情况缺乏实时的数据追踪、监督与反馈。本研究以高职英语词汇教学为切入点,基于任务型教学和第二语言词汇习得研究成果,结合游戏化机制及相关研究工具,提出基于任务型教学和游戏化的词汇教学方法, 主要包括以下三个部分:词汇学习资料、词汇学习任务、游戏化练习系统。其中,词汇学习资料主要是基于高职高专“以实用为主”的培养目标,在资深英语授课老师的协助下,以真实的语境为主,根据高职教材选择相应的学习词汇;词汇学习任务基于 Willis 提出的任务型教学框架进行优化,以任务为单位进行词汇的学习;游戏化练习系统以词汇练习为主,引入竞争、排名、实时反馈等游戏元素,从而为每个学生提供沉浸式的词汇练习环境,并对最终的练习结果进行记录和追踪,为学生提供有效的数据反馈。为了对方法的有效性进行验证,本研究在山东省济南市某高职院校开展了为期五周的教学实验,以 60 名高职一年级学生为实验对象。实验组采用本研究提出的基于任务型和游戏化机制的词汇教学方法,基于任务进行词汇学习、基于游戏进行练习,在实验过程中对个体学生的学习情况进行记录、反馈和分析;对照组采用传统非任务型非游戏化的教学模式,词汇学习按照传统流程以老师为中心进行词汇的讲解、练习和产出。所有被试的学习资料完全相同,反馈以分数为主。研究结果表明:本研究提出的方法在有效提升高职学生对英语词汇学习的积极性和参与度的同时,明显促进了他们的词汇习得和保持效果,尤其是成绩在中下游阶段的学生。在游戏化因素的选择偏向上,大部分同学对本文引入的关卡环节表示认同。最终,本方案的完整词汇设计流程受到学生的喜爱与欢迎。﹀
外文摘要：	︿ For Chinese vocational college students, insufficient vocabulary and strong psychological resistance to L2 vocabulary learning have become barriers to further study. Analyses of classroom vocabulary teaching suggest that students' poor performance is linked to teacher-centered instruction and neglect of individual differences, and gamification has been put forward as a remedy. However, related empirical studies remain insufficient. To estimate the effect of gamification on L2 vocabulary teaching and learning, we combine the task-driven method with game mechanisms to optimize the classroom learning process. This paper presents the rationale and the popularity of the scheme we designed for vocational college students. The final analysis indicates that the vocabulary teaching method based on the task-driven approach and gamification improves students' enthusiasm for and engagement in English vocabulary learning. It also effectively promotes their vocabulary acquisition and retention, especially for students with lower English proficiency. The majority of the students prefer level design to the other game elements. In conclusion, the whole learning process is popular with the students; this should be considered in future studies of gamification in L2 vocabulary acquisition. ﹀
分类号：	H087/TP391
论文总页数：	80
参考文献总数：	85
馆藏号：	017/M2017(841)
公开日期：	2017-11-20

2017-05-21

H市派出所民警工作特征对其职业倦怠的影响研究.王翔宇

题名：	H市派出所民警工作特征对其职业倦怠的影响研究
姓名：	王翔宇
学号：	1301221671
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工学硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
外文题名：	Research for The Influence of Job Characteristics on Job Burnout in Police Substation Police of H City
关键词：	工作特征职业倦怠派出所民警
论文摘要：	︿摘要近些年来，入警考试越来越炙手可热，然而，与此相对应的却是公安民警居高不下的职业倦怠率。如今，我们国家社会贫富差距在拉大，国内格局在改变，国外恐怖组织频繁作案，局势并不乐观。内忧外患的局面，加重了公安机关的工作职责。以H市派出所民警为例，他们处在为人民服务的第一线，该市公安机关70%-80%的工作职责要由他们具体落实，其工作职责尤为沉重。然而，在工作强度不断增大，工作密度、危险性直线上升，急需充足的民警力量来充实派出所战斗力的现实状态下，H市派出所民警占该市公安机关民警的比例仅为50%左右。再加上该市警民冲突事件时有发生，某些媒体和舆论的评价有失偏颇，该市民众也因此对H市派出所民警颇有不满。种种原因综合在一起，致使H市派出所民警工作压力越来越大，职业倦怠率不断升高。民警的工作特征可以被用来检测工作过程中的消极怠倦感等相关问题，而现有研究中，派出所民警工作特征对职业倦怠的影响的研究十分罕见。因此，本文旨在从民警工作的基本特征入手，使用依据H市派出所民警实际情况和已有量表自行创新改进而成的基本信息问卷、工作特征问卷及成熟的职业倦怠问卷来调查研究分析H市派出所民警的职业倦怠具体状况。其中，工作特征模型采用Hackham和Oldham提出的工作特征模型理论，职业倦怠理论模型采用Maslach等人提出的三维度职业倦怠理论。进而，以自行创新改进而成的基本信息问卷、工作特征问卷及成熟的职业倦怠问卷来收集H市派出所民警的相关数据，并通过数据分析、案例分析及访谈情况分析，得出该市派出所民警工作特征五核心维度与职业倦怠的相关性关系。本文表明，目前，H市派出所民警存在着普遍的职业倦怠状况；工作特征五核心维度对该市派出所民警的职业倦怠有显著的影响。其中技能多样性与职业倦怠显著正相关，工作特征其他四个维度都与职业倦怠显著负相关。通过访谈得出,产生这种结果的具体原因有：H市派出所民警的职业化程度差，一人多用；警力不足；无尾案件增多，派出所民警工作自主性差等；另外，H市派出所民警的心理素质、职业能力等都是不可忽视的因素。结合分析结果，本文从H市派出所民警工作特征的五大维度及民警的心理状态出发，有针对性地提出有助于消减H市派出所民警职业倦怠的对策，并在H市实施本文提出的对策。从实施的效果来看，该对策收到了良好的效果，即该对策能够消减H市派出所民警的职业倦怠程度，提升其工作积极性。﹀
分类号：	TP315
论文总页数：	100
参考文献总数：	0
馆藏号：	017/M2017(462)
公开日期：	2017-05-21

一个政府创新券申请书分析系统的设计与实现.周世洋

题名：	一个政府创新券申请书分析系统的设计与实现
姓名：	周世洋
学号：	1401210894
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
外文题名：	The Design and Implementation of a Government Innovation Voucher Application Analysis System
关键词：	自然语言处理技术多维度分析反作弊
外文关键词：	Natural Language Processing Technology; Multidimensional Analysis; Anti-cheating
论文摘要：	︿为了响应自主创新的国家发展战略以提高全社会的创新积极性，北京市推出了将创新券以创新劵的形式为创新型项目提供资金支持的政策。其中项目申请书的审核是保证创新券能够有效利用的关键。随着今年支持力度的增大，申报项目的数量必然会有大幅度的增加。每年项目申报数量持续稳定地增长，人工审核已经不能很好地满足保质保量的要求。为了提高审核人员的审核效率，本文提出了一种基于自然语言处理技术从多维度分析创新券项目申请书的方法，为审核人员提供审核指标，在减轻审核人员的审核难度的同时提高了审核人员的审核质量。本文从审核文档的需求出发，设计并实现了项目申请中的反作弊功能及项目申请书文本分析功能。反作弊的功能主要是使用SimHash算法将新申报的项目申请书比对历史申请书进行相似度查重，并根据定义的作弊行为制定规则计算作弊评价得分。根据评价得分，给出不同的响应。申请书的多维度分析功能是利用自然语言处理技术对项目申请书的文本内容从热度、新颖度、学术表现以及发展趋势进行分析。结合知网、万方和百度学术等外部资源，本文创新性地提出了申请书的热度指数、新颖指数、学术指数以及发展指数作为项目申请书的多维度分析的指标。本文研究工作的主要贡献在：反作弊功能为在大量的历史项目申请书中比对新申报项目申请书是否存在作弊行为提供了可能，这种工作是很难通过人工来完成的；审核人员根据多维度分析的四个指标，能够在没有相关项目背景专业知识的情况下对申请书做出较为客观的评价，提高了审核质量。在本文的研究和与审核人员的共同努力下，实现了从多维度对创新券项目申请书的分析。通过计算机的辅助，减少了审核人员的工作量，即使在申报项目数量的爆炸式增长的情况下也能够有效地完成审核工作。﹀
外文摘要：	︿ In response to the national development strategy of independent innovation and to raise enthusiasm for innovation across society, Beijing has launched a policy under which the government funds innovative projects in the form of innovation vouchers. The audit of project applications is the key to ensuring that government funds are used effectively. On the one hand, with the increase in support this year, the number of declared projects will inevitably increase. On the other hand, as the number of projects declared each year continues to grow steadily, manual auditing can no longer meet the requirements for quality and quantity. To improve the efficiency of the auditors, this paper proposes a method, based on natural language processing technology, for analyzing government voucher project applications along multiple dimensions; it provides auditors with audit indicators and improves audit quality. Starting from the requirements of document auditing, this paper designs and implements an anti-cheating function and a multi-dimensional analysis function for project applications. The anti-cheating function mainly uses the SimHash algorithm to compare a newly declared project application against historical applications for similarity, and calculates a cheating evaluation score according to rules derived from the defined cheating behaviors. Depending on the evaluation score, the anti-cheating function gives different responses. When the score is greater than or equal to 5, the declaration is treated as cheating. The multi-dimensional analysis function uses natural language processing technology to analyze the text of a project application in terms of heat, novelty, academic performance, and development trend. 
In this paper, the heat index, the novelty index, the academic index, and the development index are proposed as the indicators for the multi-dimensional analysis of project applications. The main contributions of this paper are as follows. It is difficult for humans to search for similar text across a large number of documents; the anti-cheating module solves this problem and can detect cheating when an application is reviewed. Auditors who lack relevant background knowledge of a project can, based on the four indicators of the multi-dimensional analysis, make a more objective evaluation of the application, which improves the quality of the audit. With the concerted efforts of the relevant staff, this work implements the multi-dimensional analysis of government voucher project applications. With computer assistance, the workload of the auditors is reduced, and the audit work can be performed effectively even when the number of declared projects grows explosively. ﹀
分类号：	TP
论文总页数：	62
参考文献总数：	35
馆藏号：	017/M2017(467)
公开日期：	2017-05-21
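
The abstract above mentions SimHash-based similarity checking of new applications against historical ones. As a rough illustration only, a minimal SimHash sketch in Python might look like the following. The thesis does not state its tokenisation, hash function, or thresholds, so the character-bigram features, the MD5 hash, and the function names here are all assumptions.

```python
import hashlib


def simhash(text: str, bits: int = 64) -> int:
    """Return a SimHash fingerprint of `text` (assumed features: character bigrams)."""
    # Hypothetical feature extraction: overlapping character bigrams.
    features = [text[i:i + 2] for i in range(len(text) - 1)] or [text]
    votes = [0] * bits
    for feat in features:
        # Assumed hash: the first bits/8 bytes of MD5, read as an integer.
        digest = hashlib.md5(feat.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        # Each feature votes +1 or -1 on every bit position.
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    # A bit is set in the fingerprint when its votes are positive.
    fingerprint = 0
    for i, v in enumerate(votes):
        if v > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")
```

A small Hamming distance between two fingerprints suggests near-duplicate text; the cut-off for flagging an application would have to be tuned on real data.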

基于自然语言处理技术识别假新闻的研究.黄颖彪

题名：	基于自然语言处理技术识别假新闻的研究
姓名：	黄颖彪
学号：	1401210585
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工学硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
关键词：	文本分类模型假新闻深度学习
论文摘要：	︿在信息发达的网络化时代，人们往往通过互联网来获取新闻资讯。不过，网上的新闻资讯良莠不齐，有一些假新闻混杂其中。假新闻通过社交网络的扩散，会造成事实被扭曲，舆论遭到误导的影响，并对正常新闻报道的公正性产生严重的冲击。因此识别假新闻成为了很多媒体信息机构的重要任务。当前对于假新闻的研究表明，识别假新闻的主要问题在于：1）很多研究是从人文社科的角度来定性分析假新闻的特征，很少提出完整的自动化系统解决方案；2）很多识别假新闻的处理，都是针对某部分类别的假新闻，采用冗杂的特征工程方法来进行识别，需要研究者具有大量的先验知识；3）工业界对假新闻的识别一般采取过滤新闻来源的方法，这种方法，需要大量的人工介入，人们需要审核各类新闻网站的内容，并不断更新假新闻来源列表，面对海量的新闻来源，人工筛选不但消耗成本巨大，而且也难免会有遗漏。近年来，自然语言处理技术在文本分类任务上取得了不错的发展，可以考虑采用自然语言处理的方法来对假新闻进行识别。不过，假新闻和真实新闻的特征构成比较相似，可能会出现文本特征重叠现象，影响分类效果。本文通过实验发现，在识别那些模仿真实新闻口吻、不含情感色彩报道的假新闻任务中，基于现有的训练方法，识别准确率并不理想。这类假新闻更适合引入世界知识来判断。本文的工作主要在于，创造性地将情感分析和文本分类方法结合，并引入到假新闻的识别任务中，只针对那些适合自动化判别情感强烈的新闻进行训练。从新闻“爆炸性”和“煽动性”特征出发，借助word2vec词向量化方法和引入关注度机制，针对性地保留了假新闻的特征。并基于深度神经网络学习框架，设计出针对带有情感倾向的假新闻文本分类识别模型。最后，通过爬取真实的假新闻数据，对文本分类过程中的各个阶段进行传统方法与本研究方法的实验对比分析，证实了本文研究提出的模型在识别带有情感倾向的假新闻任务中具有可行性。﹀
分类号：	TP3-0/G21
论文总页数：	59
参考文献总数：	0
馆藏号：	017/M2017(481)
公开日期：	2017-05-21

关于中文输入法准确率方面的研究.郑静

题名：	关于中文输入法准确率方面的研究
姓名：	郑静
学号：	1401210880
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
关键词：	输入法语言模型准确率翻译模型词性
论文摘要：	︿目前，中文拼音输入法主要采用基于词语的N-Gram语言模型（简称语言模型）计算长句候选。在这种方式下，输入法的准确率虽然有一定的保证，但都需要一个占据空间较大的语言模型作为前提，此外，要想进一步提高准确率也比较困难。在此背景下，本文提出了三个关于中文拼音输入法准确率方面的问题：（1）如何在使用较低存储空间的同时保证准确率？（2）如何在基于词语的N-Gram语言模型的基础上进一步提高准确率？（3）如何采用全新的技术进一步提高准确率？针对上述问题，本文结合数据压缩、剪枝、词性标注、以及深度学习等相关技术，提出了三个相应的解决方法，并通过实验证明了方法的有效性。（1）基于文件大小的语言模型剪枝与数据压缩。我们提出了一种方法，利用该方法可以找到准确率相对较高，而所占空间相对较小的基于词语的N-Gram语言模型。该方法通过指定文件大小和文件大小误差，然后自动得到相应基于词语的N-Gram语言模型。通过计算，我们得到各个期望文件大小所对应的准确率，基于此找出最佳的一组搭配，即达到准确率相对高，语言模型又相对小的目的。（2）基于词语语言模型和词性语言模型提高准确率。我们提出了两种将词语语言模型和词性语言模型结合起来以提高准确率的方法，第一种是先利用词语语言模型得到多个候选，再利用词性语言模型在候选词中计算得到唯一的最佳候选。第二种是把词语转移概率和词性转移概率结合在一起，计算得到最佳候选。（3）基于深度学习的翻译模型提高准确率。我们在现有基于深度学习的翻译模型的基础上，提出了一种适用于拼音到词/句转换的语言模型。改进后的翻译模型可以保证拼音串中每个拼音和词/句中每个字一一对应，从而达到输入拼音，输出最佳候选词/句的效果。通过以上三个实验，我们得到了以下结论：方法一得到的最佳词语语言模型文件大小为18.7M, 准确率为71.35%；通过方法二，输入法准确率平均提高了0.6125%；通过方法三，输入法准确率平均提高了1%。﹀
分类号：	H087/TP391
论文总页数：	70
参考文献总数：	0
馆藏号：	017/M2017(515)
公开日期：	2017-05-21

最佳教学实践指导下的英语听力学习系统的前端设计与实现.杨超

题名：	最佳教学实践指导下的英语听力学习系统的前端设计与实现
姓名：	杨超
学号：	1401210805
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
外文题名：	The Front End Design and Implementation of an English Listening Learning System Under the Support of Teaching Practice
关键词：	听力学习系统学习策略前端设计前端优化
外文关键词：	Listening learning system; learning strategy; front-end design; front-end optimization
论文摘要：	︿近年来，随着智能手机的普及和发展，无线互联网技术和智能终端逐渐延伸到了社会生活的各个领域，移动学习作为一种全新的学习方式逐渐走进人们的视野。前端技术的飞速发展为移动学习提供了新发展机遇，以HTML、CSS和Javascript为基础的Web技术使得前端开发打破了平台的障碍，让前端开发人员能够快速开发出适用于移动学习的app，并且能得到近似原生app的体验效果，为学习系统的前端开发提供了技术保障。本文的项目实践在此基础上研发了一套移动听力学习系统。有别于传统的英语教学环境与思路，本学习系统旨在研究互联网移动APP环境下英语听力能力的培养。需要指出的是，虽然移动学习并非全新的学习范畴，但仍然需要符合学习规律和教学规律，本文仅以学术界大多数人认可的教学理论实践为指导，在此基础上设计符合英语听力教学规律的学习系统，将教学理论与思路在前端予以呈现，在移动终端实现英语听力学习的智能化，对目前的听力学习产品做出改进，对传统方法做出改进和突破。由于该系统创新性较强，在理论研究和技术层面临了各种各样的问题，本文选取其中具有代表性的两个问题进行了详细论述。通过对听力学习的需求和现状分析以及目前市场上同类产品的调研，发现大部分学习软件只专注与功能开发，不注重听力教学法与学习系统的结合，造成其功能单一，界面混乱。因此，本文从学习方法设计和前端界面实现两个角度展开问题叙述，同时探讨了前端性能优化的问题。在系统的学习方法设计上，本系统以美剧为学习材料，针对移动学习特点，以情境教学法为指导，对听力学习方法进行了学前引导、预习、学中、复习四个阶段的划分，并在各阶段结合听力教学策略，对听力学习方法给出了详细的设计方案。在系统的前端界面实现上，结合学习方法的设计要求，本文给出了两套实现方案，以认知负荷与教学问题的解决为衡量标准，通过实验论证分析，验证最佳实现方案。在系统前端优化问题上，对播放器的优化设计方案进行了具体的分析研究，针对模块管理与调度、插件注册与管理给出了具体优化方案，并通过GTmetrix测试工具验证其前端性能优化效果。通过上述前端设计、实现与实验论证，本文得出了有益于听力教学的最佳方法结论，同时验证了学习系统有关前端优化设计的有效性，不仅填补了移动端英语听力学习系统前端研究与讨论的空白，也为移动学习系统的前端优化提供一些有益的参考和经验。﹀
外文摘要：	︿ In recent years, with the development and popularization of smartphones, wireless Internet technology and mobile terminals have gradually extended to every field of social life. As a new way of learning, mobile learning is gradually coming into view. The rapid development of front-end technology provides a new opportunity for mobile learning: based on Web technology, front-end development can support multiple platforms, so developers can quickly build mobile learning apps with a good user experience, which provides technical support for the front-end development of mobile learning. Building on this, the project described in this paper develops a mobile listening learning system. To make English listening learning intelligent on mobile terminals, the learning system aims to put the latest English listening teaching theory and strategy into practice. Because the system is highly innovative, it faces a variety of theoretical and technical problems; two representative problems are discussed in detail in this paper. Through an analysis of the demand for and current state of listening learning, together with a survey of related products on the market, it is found that most learning apps focus only on feature development and ignore the combination of teaching method and learning system, which leads to limited functionality and cluttered interfaces. Therefore, the paper discusses the problem from two aspects: learning method design and front-end interface implementation. The optimization of front-end performance is also discussed. In the design of the learning method, the system uses American TV dramas as learning materials. According to the characteristics of mobile learning and guided by the situational teaching method, the learning method is divided into four stages: Guidance, Preview, Learning, and Review; a detailed design, combined with listening teaching strategies, is given for each stage. 
For the front-end interface, two implementation schemes are provided according to the design requirements of the learning method, measured by cognitive load and by how well they solve the teaching problems. The optimal scheme is verified through experimental analysis. For front-end performance, the optimized design of the player is discussed in detail, including the management and scheduling of modules and the registration and management of plugins. The optimization effect is verified with GTmetrix. Through the front-end design, implementation, and experiments, the paper reaches a conclusion about the best learning method for listening teaching and confirms the effectiveness of the optimized front-end design. These results not only fill a gap in front-end research on mobile listening learning systems but also provide useful references and experience in front-end optimization for mobile learning systems. ﹀
分类号：	TP311.52
论文总页数：	66
参考文献总数：	42
馆藏号：	017/M2017(529)
公开日期：	2017-05-21

基于PGIS的某市警务信息研判系统的设计与实现.刘聪

题名：	基于PGIS的某市警务信息研判系统的设计与实现
姓名：	刘聪
学号：	1301221562
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
关键词：	PGIS 警情研判数据挖掘辅助决策
论文摘要：	︿智能警务工作是现阶段社会中，通过合理的运用现代科技，将公安系统中所涉及到的各类信息资源进行整合收集，并将公安系统中各业务模块进行整合从而有效的提高公安团队建设及执法能力的提高。其出现的关键点在于汇总人的智慧信息，并将其赋予人所使用的工具，从而使人与拥有智能的物互存互动、互补互促，以实现公安效益最优化。它标志着公安信息化正在走向数字化、网络化、智能化的高度融合。在这样的背景下,公安部门对于网络警情的掌握显得尤为重要。随着公安部门“实施科技强警战略、建立公安情报信息系统”的目标提出,公安网络警情分析研判与预警系统的建设需求空前迫切。当前，警用地理信息系统是集地理信息系统、遥感、GPS、计算机通讯技术、互联网技术以及媒体等多种信息化手段的应用实例，同样也是公安系统在日常工作中必不可少的工具。它的作用在于可以直接将用户所了解的信息以地图定位的形式表现出来，同时也包含其所有的相关信息。因此，其在公安系统中的广泛程度可见一斑。本论文结合警用地理信息系统与公安部门警情分析研判系统需求,对公安情报工作中的网络警情分析研判的相关技术进行研究,并结合公安部门已有的“公安情报信息综合平台”,探讨了警情分析研判与预警系统的设计与实现。论文首先分析了系统的研究背景与意义,介绍了网络警情情报的采集和分析研判等基于网络警情分析的公安情报工作的相关技术。对系统的总体需求、业务流程和功能性需求进行了分析,同时给出了相应的用例图进行说明;接着给出了系统的总体设计方案,系统设计中提出了设计的原则和要求,对整个警情分析研判与预警系统的总体设计进行了描述,确定了系统的技术架构和功能架构,对应用系统进行了架构设计。然后分别对警情分析研判中的警情分析、警情查询、警情上传、报表统计等子系统进行了功能设计。本文遵循软件开发设计的思想，从需求分析、系统设计、开发环境配置、系统实现等方面，详细阐述了基于PGIS警务信息研判系统的开发与实现过程。﹀
分类号：	TP319
论文总页数：	55
参考文献总数：	0
馆藏号：	017/M2017(581)
公开日期：	2017-05-21

搜索引擎查询短语中的命名实体识别方法研究.马胜节

题名：	搜索引擎查询短语中的命名实体识别方法研究
作者：	马胜节
学号：	1401210678
专业：	计算机技术
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师姓名：	俞敬松
导师单位：	软件与微电子学院
答辩日期：	2017-05-21
关键字(中文)：	自然语言处理命名实体查询短语
文摘：	︿近些年来，基于统计的方法在自然语言处理各方面都取得了很好的发展，在选择到合适的模型后，基于统计的方法可以在取得较满意的效果的基础上更加方便快速自动的完成任务。本人在一所互联网公司的的搜索研发部实习，命名实体识别是对查询请求进行处理的比较关键的一步，如果能正确的识别出查询中的命名实体，将更快的锁定用户的查询意图，更准确的将用户需要的信息召回，大大的提升用户体验。本人实习所在部门主要负责O2O 产品中的用户搜索请求，本文主要研究如何更好的识别出这些用户搜索请求中的实体词。需要识别的命名实体共有3 类：地址词、分类词、和品牌词。从用户的搜索日志来看，相较于其它搜索引擎产生的查询日志,本文要进行命名实体识别任务的搜索查询具有查询信息更短，且一条查询中实体词更加密集的特点。本文利用历史用户搜索日志，借鉴了前人从搜索日志中挖掘实体词的方法，从日志里分别挖掘出了3 类实体词然后保存成词典。在从搜索日志挖掘实体词时，本文从搜索日志普遍较短的实际情况出发，采取了根据长度和频次特征结合选取候选命名实体的方法，并取得了很好的效果。本文总结了根据日志挖掘实体词的一般方法，该方法可以定期执行以挖掘新出现的实体词。本文主要使用条件随机场模型完成命名实体识别任务，对于条件随机场模型，特征的选择对识别的效果影响很大，通过特征选择来提高识别的结果是本文的主要目标。本文在条件随机场模型中用到的特征有，基本特征，近义词特征，词向量聚类特征，实验表明，近义词特征，词向量聚类特征均可以提升模型的识别效果。本文利用触发相同的点击的数目来衡量查询的相似性，生成近义词表，并使用是否有近义词在词表中做为近义词特征。本文使用同一个会话过程中的查询作为语料生成词向量，并利用词向量聚类后的类别标签作为模型的输入特征，实验表明，使用多次聚类并将不同方法聚类产生的结果作为特征放进模型进行训练，可以使结果获得一定的提升。本文最后尝试使用循环神经网络进行命名实体识别实验，并对比了与条件随机场模型的效果，理论上循环神经网络更有优势，但在本次实验中循环神经网络表现不佳，未来希望尝试更多的实验。﹀
分类号：	TP39
论文总页数：	50
参考文献数：	0
馆藏号：	017/M2017(638)
公开日期：	2020-05-21

对话式交互中问答系统的设计与实现.岳聪

题名：	对话式交互中问答系统的设计与实现
作者：	岳聪
学号：	1301211063
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师姓名：	俞敬松
导师单位：	软件与微电子学院
答辩日期：	2017-05-21
题目(外文)：	Design and Implementation of Question Answering system in Conversational UI
关键字(中文)：	自动问答排序学习问题答案类型自然语言处理
文摘：	︿对话式交互，作为命令行交互、图形界面交互之后，又一里程碑式的交互方式，在过去一年里如火如荼地发展。而自动问答系统由来已久，它旨在给定一个自然语言形式的问题，系统能够返回一个对于该问题的精准答案。语音交互中系统得到的输入，一般是用户为寻求一些信息和服务，同时用户随性提问的问题的准确回答，将极大地提高用户的惊喜程度和产品黏性，是提高用户每日活跃度、丰富用户体验和一个创业公司技术积累的有效手段。本文从无到有，搭建了一整个对话式交互产品中的自动问答系统。从答案类型(Lexical answer type, LAT)的识别，到后面答案的学习排序，加上数据的标注、知识图谱建设和扩充、LAT-实例等的挖掘，实习期间独立搭建了整个自动问答系统的流程、其中包含很多小任务的组合。上线运行的系统，在真实的用户问题和一站到底数据集上准确率达到了94%，并且在工程方面通过多线程并发等手段优化了每条问题回答的时间。本文的创新点主要包括以下几个方面： 1．将LSTM应用于智能移动助手中用户问答意图的探测 2. 用机器学习方法而非规则识别一个问句的答案类型 3. 提出如果问句没有显式的答案类型，则根据问句的焦点词做答案类型的推断 4. 比较各种排序学习算法在中文答案排序中的效果，并试验了组合算法 5. 把基于知识图谱的多轮问答问题抽象为一个搜索问题，借鉴剪枝的思想提高答案的准确率 6. 在对话式交互系统中，嵌入问答系统，有能力和其他信息搜索式的任务、服务提供式的任务，在语音对话式交互中无缝切换﹀
分类号：	TP
论文总页数：	53
参考文献数：	51
馆藏号：	017/M2017(667)
公开日期：	2020-05-21

基于HBase的HAWQ查询优化研究与实现.谢钧涛

题名：	基于HBase的HAWQ查询优化研究与实现
姓名：	谢钧涛
学号：	1401210787
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
关键词：	HAWQ HBase 数据引擎查询优化分布式系统
论文摘要：	︿ Apache HAWQ 是一款 SQL-on-Hadoop 新型数据分析引擎，它基于 MPP 架构实现分布式查询，通过高度并行化和数据本地化策略实现了数据高效存取，依托 HDFS 存储，实现了传统数据库所不具备的线性可扩展性，并且能够提供完善的 SQL 支持和数倍于其他SQL-on-Hadoop数据分析引擎的分析速度。Apache HBase是一款基于Hadoop 的 NoSQL 数据存储，它是谷歌 BigTable 的一种开源实现，提供了高效可扩展的数据随机访问，广泛应用于海量数据的查询优化。一些企业用户使用 HBase 作为他们热数据存储，存在着 HAWQ 直接对 HBase 数据存取访问的企业需求。结合在易安信（EMC）研发集团和某研发数据分析引擎公司的实习经历，笔者发现在进行大规模数据处理的场景中，HAWQ 进行单点查询和小范围查询存在较大的优化空间。为了提供 HAWQ 对 HBase 数据访问的支持，且利用 HBase 随机存储特性改善 HAWQ 的查询效率，本论文提出了一种基于 HBase 存储的 HAWQ 查询优化系统。该系统将通过使用 HAWQ 外部表实现对 HBase 数据直接存取，并参考 HAWQ 已有的数据分片方式，结合 HBase 表数据分片特点进行创新，设计本系统的分片逻辑，达到查询并行化的目的。围绕以上设计目标，本论文主要做了以下工作：1、调研 HBase 与 HAWQ 相关特性，研究其他类似项目的实现方法；2、利用 JNI 封装的 HBase API 实现格式转换器，用于 HAWQ 对 HBase 的数据访问；3、设计并实现 HBase 分片逻辑和 HBase 数据读写控制逻辑；4、使用 YCSB 数据集测试本系统功能与性能，并与 Hive 进行对比。在相同数据集上的对比测试中，本论文实现的 HAWQ 查询时间要比 Hive 减少三到数百倍。﹀
分类号：	TP311.133.1
论文总页数：	71
参考文献总数：	49
馆藏号：	017/M2017(671)
公开日期：	2017-05-21

分布式实时流处理系统的性能和可靠性的研究和优化.吕云松

题名：	分布式实时流处理系统的性能和可靠性的研究和优化
姓名：	吕云松
学号：	1401210672
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
外文题名：	The Research and Optimization of Reliability and Performance for Real-time Stream Processing System
关键词：	分布式大数据流处理系统可靠性性能
外文关键词：	Distributed; Big Data; Stream Processing System; Reliability; Performance
论文摘要：	︿随着互联网、物联网的快速发展，数据产生的速度显著加快，而且越来越多的应用场景，例如实时广告推荐系统、在线交易系统、社交网络、城市交通监控等，要求对海量数据进行实时的处理分析。传统的批处理系统，如MapReduce已经不能满足这些应用的需求。因此，许多高校、企业和开源社区都加大了对分布式实时流处理系统的研究。然而实时的流处理比传统的批处理所面对的问题更加复杂，这是因为批处理系统的输入数据是静态的、有界的，而实时流处理系统的输入数据是动态的、无界的，且应用场景往往需要提供24×7的在线持续服务。这就要求用新的计算模型来解决实时流处理面临的问题。当前有两类计算模型，一种是对每条消息进行实时处理的计算模型，也被称为原生的流处理模型，基于此类模型的流处理系统如Apache Storm和Apache Apex；另一种是BSP（Bulk Synchronous Processing）计算模型，对消息进行微批（Micro Batch）处理，典型系统如Spark Streaming。无论是哪一种模型，都面临许多技术挑战，如降低端到端的处理延时为次秒级、提高吞吐量、出现故障后依然能够保证数据被正确处理等。本文主要对分布式的实时流处理系统的性能与可靠性方面的相关问题进行研究和优化。本文主要工作包括：1、对当前主流的流处理模型和系统进行总结；2、改进了Yahoo! Streaming Benchmark，设计了针对可靠模式下的通用评测指标，能够更加准确地评测试实时流处理系统的性能，尤其是延时；3、对Apache Storm核心的可靠性保障算法进行了研究，提出了新的可靠性算法，即Split-Aggregate算法，并在Storm 1.0.1上实现并评测了该算法，并将它与Storm原有的可靠性算法进行了对比，它减少了对CPU和网络的使用，降低了延时，提高了Storm的性能；4、研究Apache Apex的可靠性机制对其性能的影响，提出了多种改进其可靠性过程的方法，如全异步并行快照的方法以减小同步阻塞导致的延迟，基于内存数据库来减少磁盘I/O的开销，并在Apex 3.4.0上实现了上述方法；5、利用改进的Yahoo! Streaming Benchmark，对使用了新方法的Apex和原有的Apex进行了性能对比测试，结果表明，新方法能够降低端到端的延时，提高Apex的性能。﹀
外文摘要：	︿ With the rapid evolution of the Internet and IoT, data is generated significantly faster than ever before. More and more use cases, such as real-time advertising recommendation systems, online trading systems, social networks, and urban traffic surveillance, require real-time processing and analysis of massive amounts of data. Traditional batch processing systems such as MapReduce can no longer meet these demands. Therefore, universities, industry, and open source communities have been investing heavily in distributed real-time stream processing systems. However, real-time stream processing is much more complex than traditional batch processing. This is because the input data of a batch system is static and bounded, while the input data of a real-time stream processing system is dynamic and unbounded. In addition, a real-time stream processing system usually needs to run online continuously, 24×7. These needs call for new computing models. Two types of computational model currently exist. One is the record-at-a-time streaming model, also known as the native stream processing model; typical systems include Apache Storm and Apache Apex. The other is the BSP (Bulk Synchronous Processing) model, which processes data in micro-batches; a typical system is Spark Streaming. Whichever model is used, tough technical challenges remain, such as reducing end-to-end processing latency to the sub-second level, improving throughput, and guaranteeing correct data processing even after a component or system failure. Addressing these challenges in modern distributed real-time stream processing systems, including Apache top-level projects such as Apache Storm, Flink, Apex, and Spark Streaming, the paper focuses on studying and improving system reliability and performance. Our primary contributions include: 1. 
Summarize the current mainstream stream processing models and systems. 2. Improve the Yahoo! Streaming Benchmark and design general-purpose evaluation metrics for reliable mode, which evaluate the performance of stream processing systems, especially latency, more accurately. 3. Study the guaranteed message processing algorithm of Storm, propose a new guaranteed message processing algorithm called Split-Aggregate, and implement it in Storm 1.0.1. 4. Study the impact of the reliability mechanism of Apex on its performance and propose two kinds of methods to improve its performance in reliable processing mode, namely fully asynchronous checkpointing to greatly reduce blocking overhead and a distributed in-memory database for fault tolerance in place of costly disk I/O, and implement those methods in Apex 3.4.0. 5. Compare the performance of the Split-Aggregate algorithm with Storm's original guaranteed message processing algorithm; the results show that Split-Aggregate reduces CPU and network bandwidth usage. 6. Compare the new methods proposed for Apex with Apex's existing methods using our improved Streaming Benchmark; the results confirm their advantage over the existing methods. ﹀
分类号：	TP3
论文总页数：	65
参考文献总数：	0
馆藏号：	017/M2017(674)
公开日期：	2017-05-21

搜索广告中非对称先验的有监督LDA模型的设计与实现.章玲通

题名：	搜索广告中非对称先验的有监督LDA模型的设计与实现
姓名：	章玲通
学号：	1401210867
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
关键词：	搜索广告非对称先验有监督LDA 意图分类
论文摘要：	︿搜索广告通过查询词从广告库中筛选出相关的广告展示给用户，深入地分析用户的查询意图是搜索广告的重点，本文基于在微软必应广告组的实习工作，建立适用于搜索广告的查询意图多分类器，提高广告点击率的同时提升用户的搜索体验。由于查询文本短、文字歧义大、长尾现象严重，查询词和广告文本的相关性计算始终是一个难题，而短文本扩展带来的噪音直接影响最终的准确性。查询意图分类面临着如下问题：分类体系如何构建；分类方法在大规模分类体系和数据上的准确性与可靠性。当前针对大规模自然语言处理中主题模型LDA(Latent Dirichlet Allocation)的一些分布式算法都是基于各自的分布式平台，虽然整体性能有了很大提升但可移植性差。针对上述问题，本文从以下三方面给出解决方案：首先，提出非对称先验参数的有监督LDA模型。对于文本过短问题，采用查询扩展的方式，将必应搜索结果的前10条文档摘要作为扩展文本。对于噪音问题，通过训练一个有监督的LDA模型，让分类标签和LDA的隐式主题一一对应，规定短文本中所有词共享一个主题，扩展语料可以产生多种主题，使扩展语料中的噪音词分到与短文本无关的主题中，减少噪音的影响。其次是意图多分类体系的构建，本文采用谷歌的广告分类体系，共有3180个类别标签，来源于谷歌对内部的广告数据进行聚类分析，再由人工进行调整，最终形成针对广告系统的分类体系。最后是对上述的大规模离线数据进行大规模LDA的分布式训练。对Gibbs Sampling算法进行优化，将时间复杂度从O(K)降低到O(min(K_m,K_t))，其中K表示主题的数量，K_t表示词t的语义种类数，K_m表示文档m的语义种类数，K_m和K_t均远小于K。将优化的采样算法应用到Spark分布式框架中，利用GraphX将模型用图进行表示。本文的创新有两点：第一，通过非对称先验的有监督LDA有效地解决短文本查询扩展中的噪音问题；第二，改进现有的Gibbs Sampling算法，将其应用于大规模的有监督主题训练中，解决可移植性差的问题，并且速度较LightLDA有2-4倍的提升。本文将基于有监督LDA的意图分类与SVM进行对比，准确率和召回率都有了一定的提升。同时我们的相关性模型在必应搜索引擎的实际应用中有效地提高了展示广告与查询词的相关性，并对广告的点击率有了一定的提升﹀
分类号：	TP3
论文总页数：	66
参考文献总数：	0
馆藏号：	017/M2017(694)
公开日期：	2017-05-21

基于深度增强学习的多轮对话系统设计与实现.徐粲

题名：	基于深度增强学习的多轮对话系统设计与实现
姓名：	徐粲
学号：	1401210792
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
外文题名：	Design and Implementation of the Multi Turn Dialog System Based on Deep Reinforcement Learning
关键词：	策略梯度深度学习增强学习多轮对话
外文关键词：	Policy gradient; Deep learning; Reinforcement learning; Multi-turn
论文摘要：	︿在多轮开放域对话中，怎么能够理解当前对话的上下文语境做出正确的应答是实现多轮对话系统的关键。相对比单轮对话系统，多轮对话拥有更高的复杂度，当前的回复是否需要考虑上下文，需要考虑上下文的哪些信息等等问题都需要解决。当前针对多轮对话系统的构建主要有两大类方法，一种是基于匹配模型的，通过构造上下文衔接特征训练机器学习模型从候选集中筛选出能够构成对话回复的语句。另外一种方法是基于生成的方法，这种目前方法主要是基于深度学习的seq2seq方法，直接根据上下文跟回复的成对语料，通过End-to-End的方法直接学习上下文跟回复的对应关系。随着增强学习方法热度不断提高，在单轮开放域对话的生成中，最近也有研究者开始尝试使用增强学习的方法，主流的做法有两种，一种是利用预先定义好的奖赏函数(Reward Function)对生成的对话进行评分；另外一种做法就是采用GAN(Generative Adversarial Networks)的方式通过一个神经网络模型来作为奖赏函数。本文主要工作就是设计和实现了一个使用深度学习和增强学习方法的多轮对话系统，该对话系统是检索式和生成式混合的系统。在检索式方法中使用基于深度学习的方法计算query和query的语义相似度，寻找到最佳的匹配；在生成式方法中采用深度增强学习方法首次将开放域多轮对话匹配模型的state-of-art模型SMN(Sequential Match Network)作为奖赏函数，并提出突出语义重要性的策略梯度方法，该方法通过区分句子中的模板成分和语义成分采用不同的权重进行参数更新，提升了生成的对话回复质量。经过实验验证，跟基线系统HRED(Hierarchical Recurrent Encoder Decoder)比起来，本文所提出的系统架构能够在回复的相关性和多样性上面有明显的进步。系统的评测主要分为两个方面，一种是自动评测，通过PPL(困惑度)指标，另外一种是人工评测。实验结果表明本系统在PPL上提升了11.5%，在人工评测指标上31%的query回复与HRED持平，48%的query能够超过HRED，只有21%的query比HRED差。﹀
外文摘要：	︿ Understanding the context of the current dialogue in order to respond correctly is the key to building a multi-turn dialogue system in the open domain. Compared with single-turn dialogue systems, multi-turn conversations are more complex: the system must decide whether a response needs to consider the context and, if so, which parts of it. There are two main kinds of methods for building a multi-turn dialogue system. One is based on a matching model: a machine learning model trained on context-cohesion features selects a suitable response from a set of candidates. The other is generation-based: a seq2seq model generates the response word by word. As reinforcement learning becomes more popular, researchers have recently tried to use it to improve generation quality in single-turn open-domain dialogue. There are two main approaches: one uses a predefined reward function to score the response; the other uses a neural network as the reward function, in the manner of a GAN (Generative Adversarial Network). The main work of this paper is to design and implement a multi-turn dialogue system based on deep learning and reinforcement learning; the system is a hybrid of retrieval and generation. In the retrieval component, we use deep learning to compute the semantic similarity between queries and find the best match. In the generation component, we introduce, for the first time, the state-of-the-art matching model SMN as the reward function, and we improve the policy gradient method by emphasizing semantic importance. The experimental results show that the architecture proposed in this paper significantly improves the relevance and diversity of replies compared with the baseline model (HRED). The evaluation has two parts: automatic evaluation, based on the PPL score and several rounds of dialogue, and human evaluation. 
The experimental results show that our dialogue generation system improves on the HRED system by 11.5% in PPL. In human evaluation, our system ties with HRED on 31% of queries, is better on 48%, and is worse on 21%. ﹀
分类号：	TP3-05
论文总页数：	60
参考文献总数：	0
馆藏号：	017/M2017(696)
公开日期：	2017-05-21
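
The abstract above reports PPL (perplexity) as its automatic metric. As a minimal sketch of how such a figure is usually computed, assuming the model exposes a natural-log probability for each reference token: perplexity is the exponential of the average negative log-likelihood. The function names below are hypothetical, and reading the 11.5% figure as a relative reduction in PPL is an assumption, not something the abstract states.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from the natural-log probabilities assigned to each reference token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood per token.
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


def relative_reduction(baseline_ppl, system_ppl):
    """Relative PPL reduction of a system against a baseline (assumed reading of 'improvement')."""
    return (baseline_ppl - system_ppl) / baseline_ppl
```

Lower perplexity means the model assigns higher probability to the reference replies, so an improvement shows up as a reduction.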

基于句法分析的英语型式自动识别.刘潇杨

链接

题名：	基于句法分析的英语型式自动识别
姓名：	刘潇杨
学号：	1401210653
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工学硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
外文题名：	Automatic Identification of English Patterns Based on Syntactic Parsing
关键词：	英语型式；自动识别；句法分析；型式层级
外文关键词：	English Patterns; Automatic Identification; Syntactic Parsing; Pattern Hierarchy
论文摘要：	︿传统的英语研究及英语教学往往将词汇与语法视为两个独立的领域。这种做法会导致研究者和学习者无法将语法知识应用到具体词汇，从而无法真正地掌握语言的使用方法。型式是将词汇与其实际使用结合起来的一种概念，如“agree to do”就是agree的一种型式。作为词汇与语法的交界点，型式的提出很好地解决了词汇与语法过于分离的问题。为了更好地应用型式，需要通过型式识别得到更加可靠的型式资源。但是目前的研究中提出的型式识别方法都无法同时保证识别效率、准确率和全面性，而且识别对象都局限在了动词上。本文的核心工作正是在前人研究的基础上，提出了一种改进的基于句法分析的型式自动识别方法，在保证识别准确率和全面性的同时，提高了识别的自动化程度。该方法的创新点和对传统型式语法的突破点主要包括：1）改变了传统型式语法中以型式为中心，扩散出具有该型式的词汇的形式，转而从某个具体词汇出发，给出该词汇所具有的型式；2）基于完全句法分析，解决了浅层句法分析结果过于泛化的问题；3）不再遵循传统的型式表示方式，直接以句法分析的标签作为最终的型式输出，节省了从标签到传统型式元素的转换时间；4）使用多层级结构进行型式表示，使型式列表更加清晰，有利于从多个层面了解型式，也方便于日后根据不同的需求灵活更改型式的显示粒度；5）除了基本识别语料BNC外，还引入了《牛津搭配词典》和Google Syntactic N-grams中的部分资源，使最终的型式识别结果更加全面。上述识别方法的提出是以动词作为识别对象的，但是本文同时还在上述方法的基础上进行了扩展，提出了针对形容词和名词的型式自动识别方法，这在前人的研究中是从未涉及过的。提出识别方法后，本文利用这些方法，分别对5396个动词、697个形容词和1392个名词进行了型式识别。识别完成后，对其中频率较高的前100个动词的识别结果进行了专家检验，平均准确率大于96%，证明了本文所提出的方法的有效性。最终得到的型式列表，也是本研究的成果之一。最后，本文还基于提出的识别方法和得到的型式列表，对本研究在英语研究和英语教学中可能的应用方向进行了说明，并对其中的某些设想进行了初步探索，从而进一步证明了本研究的意义。﹀
外文摘要：	︿ In traditional English research and English learning, vocabulary and grammar are always treated as two separate fields. In this way, the researchers and learners cannot apply grammar knowledge to a specific word, so that they cannot really grasp the usage of English. Pattern is a concept that combines words with their actual use. For example, “agree to do” is a pattern of “agree”. As the intersection of vocabulary and grammar, pattern well solves the problem that vocabulary is separated from grammar. In order to better apply pattern, it is necessary to obtain more reliable pattern resource by pattern identification. However, the present pattern identification methods cannot achieve efficiency, accuracy and comprehensiveness at the same time, and are limited to the identification of verbs. On the basis of previous studies, this paper proposes an improved automatic identification method of pattern based on syntactic parsing, which is the core content of this paper. The method could achieve efficiency, accuracy and comprehensiveness at the same time. 
The innovations and breakthroughs of the method mainly include: 1) the method takes the word as the core and lists all the patterns the word has, instead of taking the pattern as the core and listing all the words that have the pattern, as in traditional pattern grammar; 2) the method is based on complete syntactic parsing, which solves the over-generalization of shallow parsing; 3) the method doesn’t follow the traditional form of pattern representation, but directly uses the tags in the result of syntactic parsing to represent a pattern, which saves the time needed to convert these tags to traditional pattern elements; 4) the method introduces a multi-level structure in the pattern list, making the list clearer and easier to understand, as well as making it more convenient to change the display granularity of the list according to different needs; 5) besides BNC, the method also introduces the Oxford Collocations Dictionary and Google Syntactic N-grams to make the pattern list more comprehensive. The identification target of the above method is verbs, but this paper also proposes pattern identification methods for adjectives and nouns based on the above method, which has not been addressed in previous research. Using the proposed methods, this paper identifies the patterns of 5396 verbs, 697 adjectives and 1392 nouns. The pattern lists of the 100 most frequently used verbs in the corpus are then checked by experts, and the average accuracy is above 96%, which proves the effectiveness of the method. The pattern lists obtained from the process are also one of the achievements of this paper. Finally, based on the proposed identification method and the pattern lists, this paper also describes some possible applications of this work in English research and English learning, and makes some initial exploration of some of these possible applications, which further proves the significance of this work. ﹀
分类号：	H087/TP391
论文总页数：	88
参考文献总数：	0
馆藏号：	017/M2017(702)
公开日期：	2017-05-21

术语自动抽取系统的设计与实现.石朋欣

链接

题名：	术语自动抽取系统的设计与实现
姓名：	石朋欣
学号：	1301221183
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
关键词：	专利术语识别信息抽取专利术语自动抽取设计与实现
论文摘要：	︿ “术语自动抽取”（Automatic Term Extraction）是自然语言处理领域，信息抽取的一个重要的应用方向，相对于分词、关键词抽取等任务，术语自动抽取以应用为导向，希望能够抽取结合应用场景的专业术语。本文专注针对专利领域的术语抽取算法和实现。为了解决专利术语抽取问题，本文构建了一套完整的专利术语自动抽取系统。用户输入任何一段专利文本，本系统能够基于前期的词典挖掘和文本分析，从中抽取潜在的专利术语。本文主要包括两个部分：专利术语自动抽取算法研究和专利术语自动抽取服务搭建。在专利术语自动抽取算法研究部分，本文分别研究了多种不同方法。基于词典的最大逆向匹配算法能够获得很高的准确率。基于专利术语特点的方法利用互信息和左右熵等词内和词间的特性判断短语属于专利术语的概率。基于词频分布的方法充分利用了同一个词在不同的领域会有差异的分布。基于话题模型的方法利用无监督的话题模型来学习不同词的话题分布特性。考虑到很大一部分术语来自外国科技和文化的引进，很多术语是来自翻译，因此本文还提出了基于机器翻译的专利术语提取方法。基于深度学习的方法，将专利术语提取转换为一个序列标注任务。专利术语的形成受到多方面的影响，每种方法都有其擅长的方面，本文将它们进行组合，优势互补，得到了一个灵活、平衡的、能适应多重情形的融合方法。在专利术语自动抽取系统搭建部分，本文搭建了一个网页端服务，供用户使用。还搭建了一个RESTful 风格的API 服务，供第三方应用调用。实验结果表明，本文提出的多方法融合的效果要优于单一方法的效果，并且能够更加灵活的适用不同的场景。完整的专利自动术语抽取系统，显示了本论文同时具有较强的理论价值和应用价值。﹀
分类号：	N04/TP391
论文总页数：	49
参考文献总数：	50
馆藏号：	017/M2017(711)
公开日期：	2017-05-21
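
The abstract above names mutual information and left/right entropy as features for judging whether a phrase is a term. A minimal sketch of both statistics for a two-token candidate follows, assuming the text has already been segmented into a flat token list. The function names and the use of pointwise mutual information with base-2 logs are illustrative assumptions; the thesis's actual formulation is not given in the abstract.

```python
import math
from collections import Counter


def pmi(tokens, left, right):
    """Pointwise mutual information (bits) of the adjacent pair (left, right)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    pair_count = bigrams[(left, right)]
    if pair_count == 0:
        return float("-inf")  # the pair never occurs together
    p_pair = pair_count / (len(tokens) - 1)
    p_left = unigrams[left] / len(tokens)
    p_right = unigrams[right] / len(tokens)
    return math.log2(p_pair / (p_left * p_right))


def neighbor_entropy(tokens, left, right, side):
    """Entropy (bits) of the tokens adjacent to the pair; side is 'left' or 'right'."""
    neighbors = Counter()
    for i in range(len(tokens) - 1):
        if tokens[i] == left and tokens[i + 1] == right:
            j = i - 1 if side == "left" else i + 2
            if 0 <= j < len(tokens):
                neighbors[tokens[j]] += 1
    total = sum(neighbors.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in neighbors.values())
```

A high PMI suggests the two tokens cohere as a unit, while high entropy on both sides suggests the unit appears freely in varied contexts; candidates scoring well on both are more plausible terms.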

面向自适应教学的英语口语资源加工方法的设计与实现.阙颖

链接

题名：	面向自适应教学的英语口语资源加工方法的设计与实现
姓名：	阙颖
学号：	1401210697
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
外文题名：	The Method’s Design and Implementation for Processing Spoken English Resource Based on Adaptive Learning
关键词：	自适应教学；英语口语；资源加工
外文关键词：	Adaptive Teaching; Oral English; Resource Processing
论文摘要：	︿美剧学习是提升英语口语能力的有效途径。然而，市面上现有的口语学习软件并未很好地将美剧资源用作教学用途。若要将美剧作为软件中的学习材料，需要优先解决如下四个问题：一是需要将美剧切割成粒度更小的片段，如果视频的播放时间过长，容易使学习者沉迷于剧情而忽略学习的目的；二是需要改进视频内容的编撰方式，优先让学习者学习高频的知识点；三是改进软件对教学内容的推送方式，使教学内容在软件的推送过程中符合由易到难的学习规律；四是借助技术手段辅助人工编写软件中需要用到的教学素材，提升编写效率和质量。为了解决以上问题，本文设计并实现了一套面向自适应教学的口语资源加工方案。设计目的是将美剧划分成不同难度等级的片段，用于辅助教材研究者编写教学素材或将其作为自适应学习系统的资源支持。本文设计的加工方案包括两个部分。一是切割美剧并做归类，二是从美剧中抽取口语知识点。针对第一部分，当前对于美剧的切割和分类，暂不存在自动或半自动的方法。本文设计了一个技术方案，通过识别剧本中特殊的场景标签切割美剧，并对切割好的视频按照主题归类。其中，主题归类需要依次进行语料预处理、特征词抽取、特征词加权、相似度计算、主题聚类等步骤。经处理的美剧被切割成时长合适且带有主题特征的场景片段，这些片段将自动地归类在相应的主题标签中。针对第二部分，目前尚未有成型的知识抽取方法。本文设计的方法能够从口语语料中抽取合适的词汇和搭配，并对抽取结果进行排序。其中，词汇排序考虑了单词长度、单词拼读难度、单词频率三个因素；搭配排序考虑了搭配中的词汇难度和搭配频率两个因素。排好顺序的词汇和搭配将与抽取的场景融合，共同参与场景难度计算。场景难度计算综合了词汇得分、搭配得分、场景语速三个因素，排序结果代表分配教学内容的先后顺序。基于以上加工得到的素材，本文设计了一个具体的教学应用场景，用来介绍如何将这些材料应用在实际的口语教学中。本文的有效性验证采用专家访谈的方式。通过向专家展示词汇、搭配、场景的排序结果、场景关键词、场景类别主题词，验证本文设计的有效性。访谈结果证明了本文设计的词汇、搭配、场景排序算法可行有效，场景关键词能有效辅助理解场景内容，场景类别主题词有助于辅助场景分类。﹀
外文摘要：	︿ Learning from American drama is an effective way to improve oral English. However, current English-learning applications leave much room for improvement in processing American drama as oral English learning material. To better use American drama for teaching purposes, the following problems are prominent. Firstly, it’s easy for students to get lost in the long and fascinating story and ignore the learning objectives. Therefore, cutting American drama from episodes into short segments is a necessary step. Secondly, students cannot easily access high-frequency words in the current compilation sequence. Thirdly, the recommendation order of drama does not conform to the easy-to-difficult rule of language learning. Fourthly, current compilation needs technical means to improve compilation efficiency and quality. To solve the above problems, this paper designs and implements a resource processing scheme, which cuts American drama into segments with different difficulty levels. The output resources can be used to compile oral English textbooks or to support adaptive learning systems. The processing scheme includes two parts. One is to cut show episodes into scene segments and classify scene themes, and the other is to extract knowledge points from American drama. On the one hand, there is no existing method to cut and classify American drama semi-automatically or automatically. This paper innovatively designs a way to use scene tags from show scripts to cut videos automatically, which replaces the current time-consuming and inefficient manual work. After that, this paper further performs unsupervised topic clustering. Topic clustering includes the following steps: feature extraction, feature weighting, text similarity calculation, and topic clustering. The processed scene segments have proper durations and are classified under the relevant theme labels. On the other hand, there is no mature method to extract knowledge from American drama corpora. 
This paper designs a method to extract knowledge points from the drama corpus automatically. Apart from that, this paper takes the priority of the knowledge points into consideration and designs algorithms to rank them. The word-sorting algorithm is based on the following three elements: word length, spelling difficulty, and word frequency. The collocation-sorting algorithm is based on the following two elements: word difficulty and collocation frequency. The scene-sorting algorithm is based on the following three elements: word priority, collocation priority, and scene speaking rate. The processed scenes, words and collocations will be provided to oral English learners in accordance with their priority levels. Based on the processed materials, this paper designs a specific teaching scenario to introduce how to put the processed materials into practice. This paper also verifies the validity of the scheme through expert interviews by showing the experts the results of vocabulary ranking, collocation ranking, scene ranking, scene keywords, and scene theme labels in sequence. The interviews show that the vocabulary, collocation and scene sorting algorithms are feasible and effective; the scene keywords can effectively help researchers understand the scene content; and the theme keywords can effectively help researchers classify scenes. ﹀
分类号：	H087/TP391
论文总页数：	82
参考文献总数：	42
馆藏号：	017/M2017(723)
公开日期：	2017-05-21

最佳教学实践指引下的英语词汇学习系统前端设计与实现.徐冉

链接

题名：	最佳教学实践指引下的英语词汇学习系统前端设计与实现
姓名：	徐冉
学号：	1401210795
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
外文题名：	The English Vocabulary Learning System’s Front-End Design and Development Guided by Optimum Teaching Practice
关键词：	词汇教学实践；词汇学习系统；前端设计；前端性能优化
外文关键词：	Vocabulary teaching practice; Vocabulary learning system; Front-end design; Front-end performance optimization
论文摘要：	︿近几年，互联网在线教育利用其自身优势为英语学习者提供了很好的支持，成为了一种重要的英语学习方式。在英语学习中，词汇学习起到了重要作用，因此本文对近三十年来的英语词汇教学实践进行了研究，并对行之有效的教学方法进行了总结。同时本文也对市面上同质词汇学习软件进行了调研，结果显示，目前市面上大多数词汇学习软件都只关注了“背单词”的过程，从而将词汇学习从整个英语学习环境中抽离了出来，既没有对科学的词汇学习方法进行深入探索，也没有结合语境完成词汇的输入输出过程；同时，这些软件的后台对教学实践的支持有限，无法做到真正意义上的“因材施教”。值得一提的是，在移动设备上学习并非新的学习范畴，系统设计仍然需要符合学习规律和教学规律，本文仅以学界大多数人认可的教学实践为指引，在此基础上设计符合学习规律的词汇学习系统，对目前的词汇学习产品做出改变。本系统是“看美剧学英语”项目的组成部分，由于此项目在移动端开发，因此系统在设计过程中还对传统词汇学习方法做出了符合移动系统情况的变通。这样的设计思路给系统的前端设计和实现带来了两个亟需解决的问题：一是语境中科学、弹性的词汇学习方法的设计问题，二是系统实现给前端开发带来的工程问题。本系统在教学实践的指引下，根据移动学习系统的实际情况设计了语境词汇引入、语境词汇复看和语境词汇输出三个词汇学习阶段。在对此三个阶段进行具体学习方法设计时，存在同一种教学实践指引下的多种页面实现方式。针对这种情况，本系统将依据学习目标达成情况和认知负荷这两个指标对系统实现方案进行综合考虑，通过实验和数据分析的方式选出最优的前端页面实现方式。为了解决系统实现给前端开发带来的工程问题，本系统选择了在现有前端开发框架下，开发独立封装的公共方法，并且进行单独调用的插件式开发方式。同时，这种开发方式也十分符合web前端性能优化的原则，减少了词汇插件的插入对原系统所造成的负担。此外，本文还通过前端性能测试验证了本开发方式的优越性。﹀
外文摘要：	︿ In recent years, online education has provided EFL students with strong support and has become an important way of learning English. As vocabulary plays an important role in English learning, this paper studies English vocabulary teaching practice over the past 30 years and summarizes the effective teaching methods. This paper also surveys comparable vocabulary learning software on the market. The survey shows that most vocabulary learning software focuses only on the process of ‘memorizing’ words, which separates vocabulary learning from the overall English learning environment. Such software neither explores scientific vocabulary learning methods in depth nor completes the input and output process of vocabulary in context. In addition, the back end of such software provides limited support for teaching practice, so it cannot truly teach students in accordance with their aptitude. It is worth noting that learning on mobile devices is not a new category of learning, so system design still needs to conform to the laws of learning and teaching. Guided by teaching practice that is widely accepted in academia, this paper designs an English vocabulary learning system that conforms to the laws of learning, in order to improve on current vocabulary learning products. This system is part of the ‘Learning English while Watching American Drama’ project. As the whole project is developed for mobile devices, the system also adapts traditional vocabulary learning methods to the conditions of mobile systems. This design approach brings two problems to solve for the system’s front end: the first is designing scientific and flexible vocabulary learning methods in context, and the second is the engineering problem that the system’s implementation poses for front-end development. 
Guided by teaching practice and according to the characteristics of mobile learning systems, this system designs three stages of vocabulary learning: the context words’ introduction stage, the context words’ revision stage and the context words’ output stage. When designing specific learning methods for the three stages, there are multiple possible interface implementations under the guidance of the same teaching practice. In this case, the system takes learning goal achievement and cognitive load as two indicators and selects the optimal front-end implementation through experiments and data analysis. To solve the engineering problem of front-end implementation, this system adopts plug-in style development, in which independently encapsulated public methods are developed and called separately under the existing front-end development framework. This development approach is also in line with web front-end performance optimization principles, which reduces the burden that inserting the vocabulary plug-in places on the original system. In addition, this paper verifies the advantage of this development approach through front-end performance tests. ﹀
分类号：	H087/TP391
论文总页数：	82
参考文献总数：	3
馆藏号：	017/M2017(725)
公开日期：	2017-05-21

A省公安边防信息安全管理风险评估模型设计与应用.邵健健

链接

题名：	A省公安边防信息安全管理风险评估模型设计与应用
姓名：	邵健健
学号：	1301221622
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程管理硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-21
外文题名：	Design and Application of Risk Assessment Model for Public Security Frontier Information Security Management of Province A
关键词：	信息安全；多维动态管理模型；风险评估；公安边防
外文关键词：	Information Security; Multi-dimensional Dynamic Management Model; Risk Assessment; Public Security Frontier
论文摘要：	︿进入新世纪以来，网络技术逐渐运用于公安的边防建设，从而引起了军事信息安全领域内的激烈竞争。在此背景下，我国公安边防加强信息风险评估系统的建设，并且取得了很大的进步。但是，当前国内外形势相当复杂，国外的恐怖主义试图通过多种手段窃取我国的军事国防机密，这对我国公安边防的信息安全带来了严重的威胁。所以，在未来的公安边防信息风险评估的系统建设中，我国还需要在技术、管理、观念等层面进行改进和完善。因此，本文对公安边防信息安全管理的风险评估进行研究有了现实的意义。本文通过查询和整理国内外相关文献资料，对公安边防的信息风险评估的相关基本概念进行阐述。然后对现有的风险评估方法的理论、标准、应用情况进行分析。最后以A省公安边防部队中的信息安全风险评估管理问题为例，使用多维动态管理模型对公安边防信息风险评估的现状进行分析，从分析中发现风险评估系统存在漏洞、缺乏专业的技术人员、管理不善等问题。所以，根据上述的问题本文主要从风险的识别、量化、控制等方面提出一些针对性的解决方案。本文的创新点在于通过多维动态管理模型对A省公安边防信息风险评估进行分析，包括对风险评估模型架构的分析、多源信息层分析、基础数据层分析、信息安全风险控制层分析等。结合A省公安边防风险评估的实际工作情况，运用风险识别、量化方法对信息风险进行评估，在一定程度上满足了公安边防对信息安全管理方面的准确性评估需求。它不同于以往的分析方法，通过这种多维动态模型分析，数据得到最大的分析和组织，利于控制信息的风险，及时发现风险和解决问题。﹀
外文摘要：	︿ Since the beginning of the new century, network technologies have been applied to the public security frontier, causing fierce competition in the field of military information security. In this context, China's public security frontier has strengthened the construction of its information risk assessment system and made great progress. However, the current domestic and international situation is complex. Terrorism abroad attempts to steal our military defense secrets by various means, which poses a serious threat to the information security of China's public security frontier. Therefore, when constructing the future information risk assessment system, we need to improve technology, management, awareness and so on. This gives the study of risk assessment for public security frontier information security management in this paper its practical significance. This paper expounds the basic concepts of information risk assessment for the public security frontier through a literature review. Then, the theory, standards and application of the existing risk assessment methods are analyzed. Finally, taking the information security risk assessment of the public security frontier in Province A as an example, this paper analyzes the current state of information risk assessment of the public security frontier based on the multi-dimensional dynamic management model. The findings include loopholes in the risk assessment system, a lack of professional technical staff, mismanagement and other problems. In response to these problems, this paper puts forward solutions from the aspects of risk identification, risk quantification and risk control. The innovation of this paper is to analyze the information risk assessment of the public security frontier in Province A through the multi-dimensional dynamic management model, including the analysis of the risk assessment model architecture, the multi-source information layer, the basic data layer and the information security risk control layer. 
Combined with the actual work of information security risk assessment in Province A’s public security frontier, the methods of risk identification and quantification are used to assess information risk, which to a certain extent meets the public security frontier's need for accurate assessment in information security management. This analysis method differs from previous ones. Through the multi-dimensional dynamic model analysis, the data are analyzed and organized to the greatest extent, which is conducive to controlling information risk, detecting risks in time and solving problems. ﹀
分类号：	TN915.08
论文总页数：	59
参考文献总数：	50
馆藏号：	017/M2017(741)
公开日期：	2017-05-21

垂直领域专家观点可信度关键技术研究与实现.王晴旭

链接

题名：	垂直领域专家观点可信度关键技术研究与实现
姓名：	王晴旭
学号：	1401210750
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	刘耀
导师1单位：	中国科学技术信息研究所
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2017-05-21
关键词：	垂直领域；专家观点；观点发现；可信度评分
外文关键词：	Vertical Domain; Expert Opinion; Opinion Discovery; Reliability Scoring
论文摘要：	︿随着网络媒体的迅速发展，网络舆论的影响已深刻地渗入到社会政治、经济、文化、生活等各个方面。如何正确地引导舆论，是互联网媒体面临的一个重要挑战。名人舆论作为网络舆论的一种，通过为网民提供信息、观点或建议，施加其个人影响，引导舆论走向。目前已有的一些舆情监控系统主要针对公共舆情，对舆论中的名人舆论尚未有针对性的引导和监控。另外，目前的舆情监控系统多面向通用领域，资源使用效率不高。根据上述情况，本文分析前人研究成果，总结优缺点，主要针对网络上垂直领域的专家言论，研究了知识发现、观点挖掘、可信度评估等方面的技术，并运用这些技术设计实现了垂直领域专家可信度系统。本文首先对垂直领域语料的知识发现技术进行研究，主要包括垂直领域语料的主题发现和观点持有者发现。主题发现主要根据垂直领域的主题特征，对比了改进的TF-IDF特征选择和LDA主题特征选择方法，并用SVM进行主题分类。同时针对垂直领域言论语料中观点持有者的特点，提出了基于NER和RF的观点持有者发现方法。经实验证明，这些方法能够有效地对垂直领域语料进行主题和观点持有者发现。对专家言论的观点发现进行研究。分别提出了基于依存句法的显性观点发现方法和基于KNN的隐性观点发现方法。经实验发现基于依存句法的方法精确率高，召回率不足，而基于KNN的方法召回率较好，因此采用二者相结合的方法对专家言论进行观点发现。对专家观点可信度评分技术进行研究，主要包括多元观点评分信息抽取和专家观点可信度评分。多元观点评分信息的抽取主要包括对象、时间、数值、趋势等信息元素，通过条件随机场模型实现。对CRF抽取后的观点评分信息，我们提出了基于垂直领域的分类和计算方法，并参考IMDB评分算法，提出基于观点发表数量和准确率的专家观点可信度评分标准。本文在上述关键技术的研究基础上进行了领域适用性研究。对技术进行模块化处理，并在各技术模块的实现中加入对不同垂直领域的适用性设计。基于技术模块的领域适用性研究，进一步设计了多领域适用的专家观点可信度系统架构。最后，本文实现了垂直领域专家观点可信度系统，并将系统应用于经济领域，对经济领域的专家言论进行挖掘，实现专家主题观点可信度评分、专家综合观点可信度评分和主题下专家观点可信度排序等功能，对研究的垂直领域专家观点可信度关键技术进行了有效验证。﹀
外文摘要：	︿ With the rapid development of network media, the influence of online public opinion has deeply penetrated social politics, the economy, culture, daily life and other areas. How to correctly guide public opinion is an important challenge for Internet media. As one kind of online public opinion, the opinions of celebrities exert personal influence and steer public opinion by providing users with information, views or suggestions. Existing public opinion monitoring systems mainly target general public opinion and do not yet specifically guide or monitor celebrity opinion. In addition, current public opinion monitoring systems are mostly oriented to the general domain, and their resource utilization is low. In view of this, this paper analyzes the results of previous studies, summarizes their advantages and disadvantages, and studies techniques for knowledge discovery, opinion mining and credibility assessment on expert speech corpora in vertical domains. Using these techniques, we design and implement a vertical domain expert credibility system. We first study knowledge discovery techniques for vertical domain corpora, including topic discovery and opinion holder discovery. According to the topic features of the vertical domain, we compare improved TF-IDF feature selection with LDA topic feature selection, and then use SVM for topic classification. In view of the characteristics of opinion holders in vertical domain speech corpora, we propose a method for finding the opinion holder based on NER and RF. The experiments show that these methods can effectively find the topics and opinion holders in vertical domain corpora. For expert opinions, this paper proposes an explicit opinion discovery method based on dependency syntax and an implicit opinion discovery method based on KNN. 
The experimental results show that the method based on dependency syntax has high precision but insufficient recall, while the KNN-based method has better recall. Therefore, we combine the two methods to discover opinions in expert speech. The research on expert opinion credibility scoring includes multi-element opinion scoring information extraction and expert opinion credibility scoring. The extraction of multi-element opinion scoring information mainly covers elements such as opinion object, time, numerical value and trend, and is realized with a CRF model. For the scoring information extracted by the CRF, we present a classification and computation method based on the vertical domain. With reference to the IMDB scoring algorithm, we propose an expert opinion credibility scoring criterion based on the number and accuracy of published opinions. Based on the research on the above key technologies, this paper studies domain applicability. The techniques are modularized, and applicability to different vertical domains is designed into the implementation of each technical module. Based on the domain applicability of the technical modules, a multi-domain expert opinion credibility system architecture is designed. Finally, this paper develops a vertical domain expert opinion credibility system. The system is applied to the economic field to mine expert opinions in that field. The system provides the following functions: per-topic expert opinion credibility scores, comprehensive expert opinion credibility scores, and ranking of experts' opinion credibility under each topic, which effectively validates the key technologies for vertical domain expert opinion credibility studied in this paper. ﹀
分类号：	H087/TP391
论文总页数：	77
参考文献总数：	36
馆藏号：	017/M2017(574)
公开日期：	2017-05-21
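
The abstract above says the credibility score draws on the IMDB scoring algorithm and on the number and accuracy of an expert's published opinions. A minimal sketch of that idea follows, using the weighted-rating formula IMDb formerly published for its Top 250 list, WR = v/(v+m)·R + m/(v+m)·C. Mapping R to an expert's accuracy, v to the number of opinions, C to the mean accuracy over all experts and m to a minimum-opinion threshold is an assumption made here for illustration; the function and parameter names are hypothetical, and the thesis's actual criterion is not given in the abstract.

```python
def weighted_credibility(accuracy, n_opinions, prior_mean, min_opinions):
    """Shrink an expert's observed accuracy toward the global mean.

    accuracy     -- fraction of the expert's opinions judged correct (R)
    n_opinions   -- number of opinions the expert has published (v)
    prior_mean   -- mean accuracy over all experts (C)
    min_opinions -- weight given to the prior, in pseudo-opinions (m)
    """
    if n_opinions < 0 or min_opinions <= 0:
        raise ValueError("n_opinions must be >= 0 and min_opinions > 0")
    total = n_opinions + min_opinions
    return (n_opinions / total) * accuracy + (min_opinions / total) * prior_mean
```

With this weighting, an expert who has made only two correct predictions stays close to the global mean, while one with many opinions is scored mostly on their own record.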

面向科学文献的比较式摘要生成技术研究与实现.杨雨青

链接

题名：	面向科学文献的比较式摘要生成技术研究与实现
姓名：	杨雨青
学号：	1401210816
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	刘耀
导师1单位：	中国科学技术信息研究所
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2017-05-21
外文题名：	Research and Implementation on Comparative Summarization for Scientific Literature
关键词：	比较式摘要；多文档摘要；文本挖掘；知识挖掘
外文关键词：	comparative summarization; multi-document summarization; text clustering; knowledge mining
论文摘要：	︿不少科学文献检索系统基于多文档摘要技术对检索得到的话题文献进行了“二次挖掘”，但大多只针对单个话题的相关信息，而无法提供不同话题之间的关联与区别。因此，本文提出了一种领域科学双话题比较式摘要生成模型，在开放平台知识挖掘和多文档摘要生成技术的基础上，获取基于子主题对齐和属性对齐的双话题比较信息，且基于该模型原理设计并实现了多维的领域话题比较式摘要检索系统。基于子主题对齐的比较式摘要是指挖掘双话题下的共有概念，获取共有子概念的核心文献信息作为摘要比较呈现；基于属性对齐的文献比较式摘要是指在子主题挖掘的基础上，考虑话题与其相关概念间的语义关系，获取双话题共有属性下表达不同概念的核心文献信息作为摘要比较呈现。为了更好地识别话题下子主题属性，本文充分考虑了已有知识资源，首先构建爬虫系统联合爬取领域主题词表、百科、书籍目录、资讯等公开资源，以语块级词共现为基础自动挖掘领域话题属性下的关联概念词，作为比较信息挖掘的先验知识，此外爬虫还爬取了领域期刊文献信息，经过清洗后建立了索引作为语料数据。本文探讨了文献比较式信息获取的核心技术：首先使用规则和分句重合度结合的方法抽取文献原摘要关键信息；接着通过AGNES聚类划分双话题下的文献集合，识别双话题下的共有子主题和独立子主题，特别地，在BOW构建阶段，本文结合领域特征过滤了对主题挖掘贡献度较低的词语，通过文献核心词加权、单话题高频词降权等方式改进了词权重表示方式，并联合了词频向量和隐主题向量作为聚类输入，以提升共有子主题挖掘的能力；然后，结合先验话题语义知识挖掘子主题属性，并将未登录知识补充入话题知识结构，完成话题知识的进化；最后，综合考虑子主题的属性和文献覆盖度对子主题进行排序，综合子主题内候选句重要性及句子来源文献引用量、发表时间、期刊因子等外部信息对句子进行打分，获取主题中心句。最终，通过已经得到的主题对齐和属性对齐的比较信息生成综合性的比较式摘要。本文对各个模块性能进行了实验，证明了本文提出方法的有效性。最后，本文设计并实现了基于J2EE框架的领域科学文献多维信息检索系统，集成了语料及知识爬取模块、单话题文献检索及时间线摘要文献推荐模块、双话题比较信息推荐等三大模块，通过Echarts等插件实现了系统的可视化功能，完成了科学文献多维信息检索系统的基本应用需求。﹀
外文摘要：	︿ Many scientific literature retrieval systems perform “secondary mining” on the retrieved literature with multi-document summarization technology. Nevertheless, most of them only deal with a single topic and cannot show the connections and differences between topics. Therefore, we propose a comparative summarization generation model for pairs of domain scientific topics, which generates comparative information with subtopic alignment and property alignment based on web knowledge mining and multi-document summarization technology, and we design and implement a multidimensional domain topic comparative summarization retrieval system based on it. Generating a comparative summary with subtopic alignment means mining the concepts common to both topics and presenting core information from the core literature of these shared concepts as the summary. Generating a comparative summary with property alignment means, on the basis of subtopic mining, taking the semantic relationships between a topic and its related concepts into consideration and presenting core information from core literature that expresses different concepts under properties common to both topics. We implement a web crawling system, which jointly crawls domain thesauri, open encyclopedias, book catalogs and other public domain information and automatically mines concept terms related to topic properties as prior knowledge for comparative information mining. In addition, the system also crawls domain journal literature from the web to form the literature corpus after data cleaning and indexing. Then we discuss the key technologies of the model. We extract the key information from the original abstracts of the literature. We partition the literature of the two topics with AGNES clustering and identify the common and independent subtopics of the two topics. Before clustering, we filter out words that contribute little to subtopic mining and improve the word weighting scheme to suit comparative mining. 
We also combine the word frequency vector and the latent topic vector as the input of AGNES clustering. With prior knowledge, we classify the subtopics by their properties and supplement the prior knowledge with new subtopics. We rank the subtopics by literature coverage and prior importance, and score the sentences in every subtopic by average similarity and external information such as citations, publication time and journal importance factor. Finally, we obtain comparative information with subtopic alignment and property alignment and generate a comprehensive summary. We test the performance of each module to demonstrate the validity of the proposed methods. Finally, we design and implement a multidimensional comparative summarization retrieval system for domain scientific literature. The system is based on the J2EE framework and includes a web information crawler module, a single-topic literature retrieval and timeline summarization module, and a comparative information retrieval module. The visualization functions of the system are implemented with Echarts and other plug-ins. The system meets the basic application requirements of a multidimensional information retrieval system for domain scientific literature. ﹀
分类号：	TP274/TP393.4
论文总页数：	61
参考文献总数：	40
馆藏号：	017/M2017(698)
公开日期：	2017-05-21

2017-05-19

语音技术传播可用性研究——以热线客服为例.高岚

链接

题名：	语音技术传播可用性研究——以热线客服为例
姓名：	高岚
学号：	1301220988
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工学硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2017-05-19
外文题名：	Research on The Usability of Verbal Technical Communication: Taking Hotline Customer Service as An Example
关键词：	语音技术传播；热线客服；可用性；量表编制
外文关键词：	Verbal technical communication; Hotline customer service; Usability; Scale development
论文摘要：	︿语音技术传播是一种利用语音作为唯一沟通载体的技术传播形式，因此具有技术传播和语音传播的双重特性。目前，语音技术传播已在人们的生活当中得到了广泛的应用，很多企业都为顾客提供热线客服服务，通过语音传播的方式为客户提供技术支持。但在实践当中，语音技术传播仍存在很多问题，客户的满意度依然有待提高，语音技术传播的可用性也缺乏科学的评价体系。而针对语音技术传播的学术研究也相对匮乏，不足为解决问题提供充分理论支持。故本文以热线客服为例，对语音技术传播的可用性进行研究，采用模型的方式认识并归纳语音技术传播的内涵及特点，并在此基础之上对科学评价语音技术传播可用性的方法进行研究。本文采用了规范研究与实证研究相结合的研究方式，在前人研究的基础上总结语音技术传播的特点，对语音技术传播的本质进行抽象提炼，通过规范研究的方法首先提出具有构造和解释功能的语音技术传播结构模型，在此基础上进一步提出具有启发和预测功能的语音技术传播可用性模型。该模型将语音技术传播抽象为语音信息、表达形式和传播噪声三大要素，并将其嵌入三维坐标当中，通过不同卦限来表示不同的可用性。最后，在模型的基础上提出语音技术传播可用性量表，用来评价语音技术传播的可用性。本文采用实证研究的方法，模拟客户向客服求助技术问题的真实情景，将60 名随机被试者被分成两组，分别验证模型和评估量表的质量。研究表明：在本文提出的语音技术传播可用性模型中，基本假设成立，模型可以通过有效性检验。故语音技术传播可以被分解为语音信息、表达形式和传播噪声三部分，且语音信息和表达形式与语音技术传播的可用性呈线性正相关，传播噪声与语音技术传播的可用性呈线性负相关。当三要素协同作用时，传播噪声所产生影响可以通过优化语音信息和表达形式加以削减。在此结论上提出的语音技术传播可用性量表从语音技术传播的特点出发，问题涵盖了语音信息和表达形式的各个方面，且在质量评估中表现良好，故可以作为评价热线客服语音技术传播可用性的依据。未来的相关研究可以在真实的应用场景中采用更大样本对语音技术传播进行实证研究，从而探索出更加具体的指导方针来提升语音技术传播的可用性。﹀
分类号：	G206.3
论文总页数：	92
参考文献总数：	63
馆藏号：	017/M2017(775)
公开日期：	2017-05-19

在线合作批注翻译教学研究.肖龙

链接

题名：	在线合作批注翻译教学研究
姓名：	肖龙
学号：	1401210783
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-19
关键词：	在线合作批注；翻译教学；合作式学习
论文摘要：	︿目前，在线合作批注教学研究多集中在阅读领域，应用到翻译领域的研究很少。在线合作批注阅读教学的研究结果表明，在线合作批注工具对于改善学生的阅读状况，提高成绩具有积极作用。根据笔者对当前翻译教学的实际调查，目前我国翻译教学普遍存在翻译信息反馈与教学脱节、学生参与度较低、翻译重结果轻过程等问题。鉴于在线合作批注对阅读教学的积极作用，笔者考虑将在线合作批注工具引入到翻译教学当中，对在线合作批注翻译教学进行深入研究。本文以斯坦福大学研发的Lacuna在线合作批注教学平台作为实验依托，以对比实验的方式重点研究了在线合作批注工具对翻译教学的影响。文章首先介绍了批注的种类和在线合作批注的理论基础，整理了当前在线合作批注阅读教学的现状，并对本次实验用到的重要技术工具Lacuna进行介绍。根据对翻译教学的调查，笔者梳理了学生与老师的翻译教学需求，并以此为依据详细阐述了本次实验的设计和实施流程。通过对不同阶段翻译教学实验数据的收集，从翻译信息反馈、学生参与、翻译能力、翻译成绩和学生态度的角度分析在线合作批注对教学效果的影响。研究表明，在线合作批注翻译教学不仅能够帮助教师及时掌握学生翻译的整体情况，提升学生在翻译教学各个环节的参与程度，而且对于加强学生翻译讨论的集中性、提升学生的翻译评价能力具有很大帮助，还可以促进学生更好地理解英汉两种语言在词汇表达、句子结构、文化内涵的差异，通过批注讨论的方式实现相互学习与借鉴。在文章最后，笔者整理了本次实验的结论，指出了本次研究的主要贡献及不足，同时对未来研究进行展望。﹀
分类号：	H087/TP391
论文总页数：	83
参考文献总数：	50
馆藏号：	017/M2017(301)
公开日期：	2017-05-19

从认知负荷的角度探究弹幕对在线学习者的影响.路康虹

链接

题名：	从认知负荷的角度探究弹幕对在线学习者的影响
姓名：	路康虹
学号：	1301210832
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-19
关键词：	认知负荷；弹幕；娱乐化学习
论文摘要：	︿随着科技的发展，越来越多的人选择使用在线学习的方式进行自主学习。然而，在线学习提供更高的个人学习自由度的同时，也带来了新的问题，如群体缺失带来的孤独感，以及学习者在线交流活跃度低等问题。弹幕在提供临场感和陪伴感、刺激用户交流方面有天然的优势，因而弹幕在教育领域中的应用成为了弹幕研究的热门方向之一。以往对弹幕的研究，往往基于弹幕自身的优势，探究在学习环境下，弹幕在促进交流、构建学习共同体方面的效果。然而随着在线交流参与度的提升，必然带来发言数量的增多。弹幕直接将评论内容展示在视频中，学习者需要在观看视频学习时，同时处理弹幕中的信息。这样的信息呈现方式对学习者是否有负面影响，有待探索。本研究从已有弹幕应用的娱乐化学习领域入手，从认知负荷的角度切入，通过实验探究了弹幕对在线学习者的影响，分析了引入弹幕后，学习者认知负荷、情绪反馈和学习效果的变化，总结了主要影响因素，并探索了控制负面因素的方向。本文详述了研究中的理论分析、实验过程以及结论。实验结果显示，弹幕不加控制地展示时，显著增加了学习者的外部认知负荷，诱发负面情绪，对正面情绪和学习效果的影响并不显著。弹幕的数量是影响外部认知负荷的重要因素，弹幕量越多，外部认知负荷越高，负性情绪也越强烈。弹幕数量相对合理时，弹幕内容与学习视频内容相关性显著影响关联认知负荷，突出展示的高相关性内容，能提升关联认知负荷，迁移测试成绩也相应较高；但弹幕中质疑视频内容的部分给学习者带来困惑，明显强化烦躁情绪。总而言之，弹幕的引入会增加学习者的认知负荷，诱发负面情绪，但可以通过控制弹幕数量、筛选并突出展示与学习视频内容相关度高的弹幕的方式，弱化弹幕的负面影响。在实际应用中，可以参考这两个因素，提供相应的控制功能，以进一步优化弹幕作为在线学习中师生、生生交流方式和用户评论呈现方式的实用效果。﹀
分类号：	TP3
论文总页数：	81
参考文献总数：	0
馆藏号：	017/M2017(489)
公开日期：	2017-05-19

以问题为导向的翻转课堂自学效果的应用研究.龙翔

链接

题名：	以问题为导向的翻转课堂自学效果的应用研究
姓名：	龙翔
学号：	1301210823
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	钱多秀
论文答辩日期：	2017-05-19
外文题名：	Application Research on Effect of Self-learning in Problem-guided Flipped Classroom
关键词：	翻转课堂；以问题为导向的学习；过程性评价；社会网络分析法
外文关键词：	Flipped Classroom; Problem-based learning (PBL); Process Assessment; Social Network Analysis
论文摘要：	︿信息技术为教育带来了新一轮改革，其中翻转课堂（Flipped Classroom）正在颠覆传统的教学模式。在翻转课堂环境中，学生可以摆脱课堂中机械、传统的教学方式，真正掌控自己的学习，不仅可以在课前利用互联网技术对课程内容进行自学，同时可以在课后与学习同伴进行深度的学习讨论，并强化自己所学的知识。教师也由传统的知识的“传授者”，向学生学习过程的“参与者”转变，并对每个学生展开有针对性的个性化指导。然而，笔者作为北京大学《计算机辅助翻译原理与实践》课程的助教，经过一年的学习和总结发现，目前的翻转课堂依然存在着学生对于学习知识的“迷航”，对于课程参与度不高，以及由于学习效果跟踪手段欠缺造成评价体系不完善的问题。那么，如何调动学生参与学习和讨论的积极性，如何全面了解和评估学习者的学习情况、如何有效的评价学习效果，成为在翻转课堂环境中开展教学需要解决的问题。针对上述问题，笔者对目前翻转课堂所存在的问题以及讨论题目本身所存在的现实需要进行阐述，结合课程的整体要求，阐明了以问题为导向的翻转课堂模式研究的必要性，并明确了研究的思路和研究方法，提出了以问题为导向的翻转课堂的教学模式，其中包括：课前对于课程内容、讨论题、以及学习重点的发布；课程中对于学习者之间互动讨论的技术支持以及课后对于学生个体学习效果的动态跟踪评价。本研究于2016年9月至2017年1月在北京大学语言信息工程系16级共计39名研究生一年级学生中进行为期一个月的调查研究与实验，并利用社会网络分析法（SNA）对学生在翻转课堂中的讨论学习记录进行分析，运用UCINET软件对其进行分析总结，通过定性和定量分析对实验效果进行阐述和总结，通过实验组和对照组的两组数据进行对比分析以问题为导向的翻转课堂在实际开展和推行中的实际效果。通过实验及调查问卷分析，笔者发现，以问题为导向的翻转课堂教学模式在实际教学中确实能够帮助学生提高学习主动性，帮助学生解决课程重点的把握的问题；使学生积极参与讨论学习，并为教师客观动态地评价学生的学习效果提供依据。实验结果证明无论是教学效果还是学生学习满意度实验组学生均高于对照组学生。笔者希望通过本文研究为翻转课堂的教学模式提供新的思路和解决办法，也为以后大规模集体在线学习提供了新的教学模式。﹀
外文摘要：	︿ Information technology has brought a new round of education reform, flipped classroom has converted the traditional mode of teaching. Students extricated themselves from traditional teaching method and became the real master of their own learning in the flipped classroom. They can learn by themselves with Internet technology, and also they can discuss with their classmates in depth after class. Instead of traditional imparting knowledge, teachers have more chances to take part in students’ self-learning and give guidance according to everyone’s situation. However, students still get lost when learning in current mode of flipped classroom, furthermore, they do not have motivation in participating. This not only influences students’ enthusiasm, but also intervenes teacher’s master of students’ learning outcome. So, how to arouse the initiative of students and how to evaluate students’ learning outcome comprehensively and thoroughly becomes the problem needed to be solved in flipped classroom. In order to solve the problems discussed above, the author comes up with the new mode of problem-guided flipped classroom, which includes announcement for key points and difficult parts, as well as discussion topics before class; technical support between students in class; also real-time track and evaluation for students’ learning outcome after class. After one month’s investigation and experiment in 39 fresh graduate students of Language Information Engineering Department of Peking University from Sep. 2016 to Jan. 2017, the author recorded and analyzed discussion with software Ucinet under the guidance of Social Network Analysis, and summarized the effect of experiment by making quantitative analysis and qualitative analysis. By comparing data of two different groups (experimental group and control group), the author researched and analyzed the real effect of this new mode of problem-guided flipped classroom. 
Through experiment and questionnaire analysis, the author found that the problem-guided flipped classroom mode can help students improve their learning initiative in practice and enables them to participate in discussion more actively. Both the actual learning outcomes and the learning satisfaction of the experimental group were higher than those of the control group. The author hopes to provide a new way of thinking and new solutions for the teaching mode of the flipped classroom, and also a new teaching model for large-scale collective online learning in the future. ﹀
分类号：	H087/TP391
论文总页数：	93
参考文献总数：	52
馆藏号：	017/M2017(524)
公开日期：	2017-05-19

基于规则方法的对外汉语“语序错误”检测研究.张璐瑶

链接

题名：	基于规则方法的对外汉语“语序错误”检测研究
姓名：	张璐瑶
学号：	1401210846
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-19
关键词：	对外汉语中介语；规则挖掘；语序错误位置识别；语序错误修改
论文摘要：	︿语序错误是对外汉语最常见的一种偏误。已有的研究表明，为外语学习者提供有效的校正反馈，有利于其语法能力的发展。因此针对于语序错误的偏误特点，来开发一款可自动识别和改正对外汉语语序错误的检测工具，能够帮助学生提高其语法能力，同时可以帮助教师减少工作负担。但是由于汉语的语法灵活，语序错误表现错综复杂，因此，自动识别和改正语序错误的困难很大。目前，针对对外汉语语序错误的研究非常少，已有的方法处理这个问题的效果都不理想。本研究选择使用基于规则的方法。但是传统基于规则的方法都要依靠语言学家人工总结规则，自动化程度极低，耗费大量人力。因此，本研究探索使用基于n-gram的方法来自动挖掘规则，以代替人工总结规则的方式。还将自动挖掘的规则进行分类，探究这些规则背后的语法知识。在研究的最后，使用所有挖掘到的规则建立成规则模块，并将所对应的语法知识建立成语法知识模块，开发出一款自动检测工具。这款工具能自动识别语序错误的位置并修改语序错误，向用户推荐相应语法知识，还支持用户批量导入标记了语序错误位置的汉语中介语语料，自动大量生成规则，并且允许用户用正则表达式添加规则，来拓展规则库。详细地说，本研究自动挖掘了词序列规则和词性序列规则。两种规则的挖掘都使用n-gram技术做支撑。在挖掘词性序列规则之前，先进行了前测研究，证明对于词汇简单的对外汉语写作，自动挖掘的词性序列规则是可用的。本研究的实验采用了10-折交叉验证实验。实验结果如下：在10次独立的实验中，每个实验平均从3015条语料中挖掘出了2243条词序列规则与669条词性序列规则。词序列规则在正确语料的平均误报率是0.93%，在识别语序错误的位置时，平均正确率是62.00%，平均F值是15.49%，如果语序错误位置确定，平均有79.41%的语序错误被正确修改；词性序列规则在正确语料的平均误报率是0.66%，在识别语序错误的位置时，平均的正确率达到63.18%，平均F值是10.24%，如果语序错误位置确定，平均有78.73%的语序错误被正确修改；将词序列规则和词性序列规则结合后形成规则库，规则库在正确语料的平均误报率是0.93%，在识别语序错误的位置时，平均的正确率是60.07%，平均F值是19.89%，如果语序错误位置确定，平均有79.37%的语序错误被正确修改。本方法的意义在于使用基于统计的方式从词和词性的层面上自动挖掘规则，完全实现了规则的自动化获取，免除了总结规则时的人工投入。对于像语序错误这种表现形式十分复杂的语法偏误，用基于统计的自动挖掘规则的方法，全自动获得规则，之后可在自动获取的规则的基础上，将规则进行一定程度的泛化，提高获取规则的效率。同时，自动挖掘的规则能够概括偏误的常见类型，能反映出汉语中介语的语言学规律，可作为对外汉语教学与研究的参考。本研究最终开发的自动检测工具，可帮助对外汉语学生检测其写作中的语序错误，也可辅助对外汉语教师批改学生写作中的语序错误，以减少其工作量。﹀
分类号：	H087/TP391
论文总页数：	78
参考文献总数：	0
馆藏号：	017/M2017(555)
公开日期：	2017-05-19

基于数据驱动和形成性评估的高中英语词汇教学研究.程鑫

链接

题名：	基于数据驱动和形成性评估的高中英语词汇教学研究
姓名：	程鑫
学号：	1401210528
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-19
外文题名：	Research on High School English Vocabulary Teaching Based on Data-Driven Method and Formative Evaluation
关键词：	二语词汇习得；数据驱动；形成性评估
外文关键词：	L2 vocabulary acquisition; Data-driven; Formative evaluation
论文摘要：	︿在过去的几十年里我国的课堂词汇教学取得了显著成就，但依然有较大的改进空间。教学成果的评估多为终结性的，忽略了学生的主体地位；评估数据缺乏量化分析和有效反馈；评估结果未能很好地作用于教学活动；没有很好地关注到学生的个体差异性；词汇教学安排没能充分发挥语料库的价值。本研究以高中英语词汇教学为切入点，基于形成性评估和第二语言词汇习得的研究成果，结合语料库及相关研究工具，提出了基于数据驱动和形成性评估的词汇教学方法。该方法关注学生的个体差异和主体地位，注重评估的分析和反馈，并且充分发挥语料库的作用。本研究提出的方法主要包括以下四个部分：学习资料、形成性评估、启发式规则系统、复习部分。学习资料主要是通过对英国国家语料库、英语教材等的分析进行词汇筛选、排序以及学习材料的选择。形成性评估包括课堂练习、多维评估、评估记录、学生语料和学习档案袋。启发式规则系统通过对当前学习情况和学生水平的分析，提出复习建议，绘制学习情况图，帮助学生直观了解自身学习情况。复习部分主要是根据复习规则和学生个体情况，为学生推荐个性化的复习材料。为了对方法的有效性进行验证，本研究在安徽省南陵县某高级中学开展了为期3周的教学实验，以40名高一年级学生为实验对象。实验组采用本研究提出的基于数据驱动和形成性评估的词汇教学方法，基于多因素安排词汇学习顺序，所有被试的初始学习材料相同，在实验过程中对个体学生的学习情况进行记录、反馈和分析，按需提供个性化的复习资料，并进行跟踪评估和调整；对照组采用终结性非数据驱动的教学方法，词汇学习顺序依教程词表顺序而定，所有被试的学习、复习材料完全相同，反馈内容以分数为主。研究结果表明：本研究提出的方法确实可以有效促进高中英语词汇的习得效果和保持效果，尤其是对于英语水平中等及以下的同学；同时也为学习者所欢迎。﹀
外文摘要：	︿ Although classroom vocabulary teaching has shown significant improvement in the past decades, limitations still exist. In current stage, summative evaluation is widely adopted, in which the thorough analysis of study data, effective feedback and needs of students with different proficiency levels have not received enough attention. Besides, corpus does not play its full role in vocabulary teaching well. Focusing on senior high school vocabulary teaching and learning activity, this study adopts the formative evaluation and data-driven method in the process, by using data analysis, corpus and related tools. Theoretical and experimental fruits of formative evaluation and second language acquisition also provide guidance for this study. The vocabulary teaching method put forward here consists of four parts: study material, formative evaluation, heuristic rule system and review. Study material is selected from British National Corpus, senior high school textbooks and other materials. Formative evaluation is made up of exercise, evaluation, record and electronic study portfolio. Heuristic rule system is designed for providing review suggestions and feedback based on the evaluation and student’s general English proficiency. As for review, it is mainly responsible for providing proper and personalized review materials and practice, based on certain rules and student’s condition. In order to testify the availability and effectiveness of this method, a three-week-long teaching experiment is conducted in one senior high school of Nanling County, Anhui Province, with forty students as the experiment targets. The experiment group consisting of twenty students, are taught according to the teaching method promoted in this paper. In the first period, they get the same study materials, but within the process, their review material differs from each other’s, due to their study condition and English proficiency. 
In the process they need to review not only their own, but also classmate’s work, and then each student can get their own visualized feedback in the form of graph or charts or both. Besides, their words learning sequence is arranged according to several factors. While students in comparative group receive the same study and review material no matter how different their levels and study conditions are. The feedback they got are mainly scores. The word sequence is similar to the order in the textbook vocabulary list. The results show that the vocabulary teaching method based on data-driven method and formative evaluation brings better vocabulary learning and keeping result, especially for students with comparatively low or intermediate English proficiency and is accepted by the majority of students. ﹀
分类号：	H030
论文总页数：	118
参考文献总数：	111
馆藏号：	017/M2017(561)
公开日期：	2017-05-19

基于wiki的小组协作式翻译教学研究.代碧薇

链接

题名：	基于wiki的小组协作式翻译教学研究
姓名：	代碧薇
学号：	1401210530
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-19
外文题名：	Research on Wiki-based Group Collaborative Translation Teaching
关键词：	wiki；小组协作；翻译教学
外文关键词：	Wiki; Group Collaboration; Translation Teaching
论文摘要：	︿当前，wiki在外语学习的研究主要集中在协作性的写作教学中，将其应用到协作性翻译教学的研究极少。但随着翻译行业的发展，翻译已经成为了一个极具协作性的活动，翻译教学也越来越注重学生协作能力的培养。因此，利用wiki协作性的优势来进行翻译教学是翻译人才培养的趋势。本文将wiki引入小组协作模式下的翻译教学，利用wiki功能来提高小组协作翻译的教学效果。首先，本文针对当前小组翻译教学问题及需求，进行了相应的教学设计和评估设计。其次，利用实证研究的方法对中南大学英语专业三年级40名学生进行了近两个月的翻译教学实验。实验前，学生分为实验组和对照组，通过前测验证两组翻译水平并无统计学意义上的差异后，再各自划分为6个小组分别采用不同的方式进行小组翻译；翻译后通过问卷调查进行定性分析，以调查学生对于此次翻译教学的态度。最后，本文利用wiki平台对实验组学生的翻译数据进行不同维度的统计、分析，并得出了结论。本文主要研究的问题包括wiki在小组翻译教学中的有效性，以及wiki支持下的小组翻译教学与传统小组翻译教学产生的差异如何。实验结果表明，wiki功能对翻译教学具有积极的作用。对于教师来说，wiki将翻译过程“透明化”，便于教师在翻译过程中对学生的翻译进行考察和监督。而对学生而言，wiki提高了学生参与小组翻译的自觉性和协作性，学生能够自觉地分配、完成翻译任务，能够积极地与小组成员进行翻译疑难点的讨论，并根据有效反馈对译文错误进行修改，从而产出质量更高的译文终稿。研究还发现，教师应该对学生进行适度的监督及引导，提高学生对翻译各环节的重视程度、督促学生合理安排翻译时间，以更好地发挥wiki及小组协作的优势。总之，本研究认为利用wiki进行小组翻译教学，不仅能够促进学生语言知识、翻译技巧的学习，还能提高他们的审校能力、协作互动能力、翻译评价能力、问题解决能力以及信息技术能力。﹀
分类号：	H087/TP391
论文总页数：	78
参考文献总数：	58
馆藏号：	017/M2017(627)
公开日期：	2017-05-19

中国网络玄幻小说海外译介研究.邓平博

链接

题名：	中国网络玄幻小说海外译介研究
姓名：	邓平博
学号：	1301210596
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	张宏岩
论文答辩日期：	2017-05-19
外文题名：	Studies on the Translation and Communication of Chinese
关键词：	网络小说；玄幻小说；译介学
外文关键词：	Web Novel; Xuanhuan Novel; Translation Studies
论文摘要：	︿中国网络玄幻小说自 2015 年底开始在海外流行，有一批通晓双语的中文玄幻小说爱好者自发地在网上翻译、传播这类小说，并迅速赢得了一大批海外读者的青睐。亚洲网络翻译小说聚合搜索网站NovelUpdates 上排名前 10 的小说，有 9 部来自中国。本文的研究目的，是探究这一文化现象的参与主体、内容、发展经过和背后的原因。在译介学的理论框架下，重点研究中国网络玄幻小说海外译介的主体、译介内容和译介受众三个要素。具体而言，回答了谁在翻译、为什么翻译、翻了什么、谁在读、为什么读、未来发展会怎样等问题。研究方法上，本文访谈海外翻译组成员 15 人、问卷调查 103 人；访谈海外读者 20 人，问卷调查 330 人；分析统计读者讨论、书评 400 余条（篇）。本文研究发现，中国网络玄幻小说的海外译介主体不是个别爱好者个体，而是有组织、成规模的“翻译组”。译者们的工作动机主要基于对这类小说的兴趣。中国网络玄幻小说的海外译介内容受译者偏好、原作在国内受欢迎程度以及海外读者偏好三个因素共同影响。海外读者作为译介受众，多为高中及大学在校男生，大多数从漫画和日本轻小说读者转化而来，有网络文学的阅读经历。海外读者喜爱中国网络玄幻小说的要素有：小说主人公离经叛道、新颖的世界观设定和异域感、情节推进有力、趣味性强，以及国内外读者共同的阅读喜好。海外读者不喜爱的要素有：内容想象力匮乏、性侵和歧视性内容以及过多、单一的中国文化元素。基于以上研究发现，本文还对中国网络玄幻小说在海外的发展做出了展望，提出了玄幻小说在海外面临的三大挑战：读者群体不够大、内容创新后劲不足、以及韩国小说优秀内容的有力挑战。本研究的创新点在于对这一新出现的文化现象进行了全方位的调查研究，并且探究了现象背后的原因。本研究的内容和发现对未来中国网络文学在海外的发展，以及中国文学外宣和海外译介，有一定的启示性作用。﹀
外文摘要：	︿ Chinese Xuanhuan web novels have become popular abroad through the efforts of their foreign fans. Since 2015 these fans have spontaneously translated and spread Xuanhuan novels and have won over a large number of readers. Among the top ten novels on the translated Asian novel navigation website NovelUpdates, nine are from China. The present thesis is conducted around this translation phenomenon. Under the theoretical framework of translation studies, this thesis focuses on the subject, content and audience of the translation and communication of Chinese Xuanhuan web novels abroad. Through questionnaires and interviews with translators and readers abroad, and by searching and analyzing the works and the literature, the thesis presents this translation phenomenon from all aspects and explores the underlying reasons behind it. The research finds that, instead of individual fans, the translation subject of Chinese Xuanhuan novels is the organized, large-scale “translation group”. The translators are mainly motivated by their interest in Chinese Xuanhuan web novels. The translated content is decided by three factors: the translators’ preference, the popularity of the original work in China and the preference of readers abroad. Most of the readers abroad are male students in high school or university. They have read web novels before, and most of them came over from comics and Japanese light novels. The elements which win readers’ favor are as follows: a) the hero rebels against orthodoxy; b) the world view presented by the novel is fresh and exotic; c) the plot development is fast and interesting; d) domestic and overseas readers share common reading preferences. The unfavorable elements are: a) the plot is lacking in imagination; b) there is too much sexual assault and discrimination in the plot; c) there are too many Chinese elements in the book. 
Based on the above findings, the thesis looks ahead to the development of Chinese Xuanhuan web novels abroad. They face three challenges: a) the group of readers is not big enough; b) content innovation lacks momentum; c) excellent Korean novels pose a strong challenge. The present thesis offers useful insights for the overseas development of Chinese web novels and for the promotion of Chinese literature in foreign countries. ﹀
分类号：	H059
论文总页数：	78
参考文献总数：	38
馆藏号：	017/M2017(628)
公开日期：	2017-05-19

美国嘻哈乐歌词翻译研究.周天亮

链接

题名：	美国嘻哈乐歌词翻译研究
姓名：	周天亮
学号：	1401210895
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	张宏岩
导师2单位：	软件与微电子学院
论文答辩日期：	2017-05-19
关键词：	嘻哈乐；歌词翻译；语料库
论文摘要：	︿嘻哈乐（Hip-hop music）起源于上个世纪七十年代的美国纽约布朗克斯的黑人街区。经过四十多年的发展，在经济全球化、音乐商业化的推动下，嘻哈乐已经不只属于贫民窟里的非裔美国人，它已经跨越种族、国度，在全世界流行。但是国内对嘻哈乐歌词翻译的研究较少，现有的嘻哈乐歌词译文常存在错译或晦涩难懂等问题。笔者希望结合自身多年欣赏嘻哈乐的经验，综合运用定量和定性的分析方法，分析出嘻哈乐歌词的特点，寻求嘻哈乐歌词更好的翻译方法。前人关于歌曲有没有必要译的争论一直存在。笔者把音乐软件作为歌词译文阅读载体，以歌词翻译而不是歌曲翻译作为出发点，从三个方面论述了嘻哈乐翻译的必要性：首先，音乐软件为了提供更好的服务，为外语歌曲提供歌词翻译已经是一种常态，而全球流行的嘻哈乐自然是音乐软件重点翻译的对象，并且绝大多数用户无法用英文直接读懂嘻哈乐歌词，用户也需要翻译；其次，嘻哈乐重歌词轻旋律的特点，更加凸显了翻译歌词的意义；最后，嘻哈乐歌词中具有重要的社会意义与文化价值，既是美国社会与文化的一面镜子，也可以是一种有用的工具。本研究通过语料库工具对嘻哈乐歌词进行了分析，发现其用词丰富度、难度比非嘻哈乐歌词更大，粗口充斥在歌词中，是译者无法回避的特征。此外，嘻哈乐歌词特殊另类的拼写方式体现了其语言的非正式性，歌词中还存在许多数字，这些数字有着特殊含义。本研究还通过具体的翻译实践和对现有译文的分析，发现嘻哈歌词具有“欺骗性”，表面上简单的词汇，在嘻哈乐中却有着不一样的含义，译者容易受到“欺骗”进而错译。笔者还发现嘻哈乐歌手背景信息非常重要，是理解嘻哈乐歌词的基础，并且嘻哈乐存在主歌和副歌风格截然不同的情况。针对以上分析出的特点，笔者提出了操作性较强的六点翻译技巧：（1）粗口翻译的取舍；（2）通俗化、流行化语言的应用；（3）数字指代意义的翻译；（4）俚语词典的优先使用；（5）歌手背景信息的挖掘；（6）听与译的结合。笔者由此认为，嘻哈乐歌词的翻译工作是有需求、有价值、有意义的。嘻哈乐歌词有着它与众不同的地方，译者需端正态度，转变翻译思路，应针对嘻哈乐歌词的特点进行翻译。﹀
分类号：	H087/TP391
论文总页数：	70
参考文献总数：	48
馆藏号：	017/M2017(636)
公开日期：	2017-05-19

政治词汇的汉译策略——以《分裂社会的城与魂》一书的翻译为例.张咪

链接

题名：	政治词汇的汉译策略——以《分裂社会的城与魂》一书的翻译为例
姓名：	张咪
学号：	1301211085
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2017-05-19
外文题名：	The Translation Strategies for Political Words - A Case Study of the Translation of City and Soul in Divided Societies
关键词：	政治词汇；汉译；翻译策略
外文关键词：	politics; political words; translate into Chinese; Translation Strategies
论文摘要：	︿当今世界政治变化无穷，重大事件不断发生，政治词汇的创新层出不穷。“Post-truth”（后真相的）、“Alt-right”（另类右翼）以及“Brexiteer”（脱欧者）这三个政治新词就被《牛津词典》列为2016年的热门词汇。此外，政治词汇涉及范围广泛，要厘清政治词汇的范畴并非易事。目前国内学者对汉语政治词汇的翻译研究颇多，鲜少有对英语政治词汇汉译策略的专项研究。本文基于对City and Soul in Divided Societies的翻译实践，该书由美国学者斯科特·伯伦斯创作完成，于2012年出版。作者刻画了萨拉热窝、约翰内斯堡、贝尔法斯特、尼科西亚、巴斯克地区、莫斯塔尔、巴塞罗那等九个分布于世界各地正处于分裂状态的地区的情况。书中政治词汇颇多，政治词汇的独特语言特点、政治新词以及政治难词引发笔者思考英语政治词汇的汉译策略。笔者采用文献研究法、在线语料库检索工具Wmatrix以及翻译实践中的具体例证，对政治词汇汉译策略进行深入探讨。笔者使用语料库检索工具Wmatrix 提取了书籍中的430个政治词汇和28个政治缩略词，并进行了量化研究，厘清政治词汇的范畴并总结了政治缩略语的处理方式，此外指出了Wmatrix研究政治词汇的局限性。此外，本文基于City and Soul in Divided Societies的翻译体验，针对社科文本中的英语政治词汇的处理提出两大翻译原则：选词规范严谨和目标读者导向。在上述原则的指导下，笔者将政治词汇分为新兴政治词汇、学科交叉政治词汇、政治外来语、政治多义词四类，针对上述分类提出具体汉译策略为：巧用构词法、交叉学科法、外来语特殊处理法、语义深挖法。通过使用翻译实践的实例对策略进行阐述，说明策略的有效性。希望笔者这些探究能对政治词汇的相关翻译研究有所助益，同时促进和加强中外文化之间的交流和融合。﹀
外文摘要：	︿ In today’s world, important political events happen frequently and new political words keep coming into being. In 2016, when the EU referendum in the United Kingdom and the presidential election in the United States took place, English political words such as Post-truth, Alt-right and Brexiteer were so popular that they were chosen as international words of the year by Oxford Dictionary. Political words are always hard to define. Nowadays there is much research on Chinese political words, but little specific research on the translation of English political words. The author completed the translation task of City and Soul in Divided Societies and used literature research, the corpus retrieval tool Wmatrix and specific examples from translation practice to explore the translation strategies for English political words in social science works. The author used Wmatrix to retrieve political words and analyzed them in order to find the classifications and characteristics of English political words. The author observed that English political words may have links with many other subjects such as war, ethics and so on. The author proposed two principles for translating political words: strict word choice and reader orientation. The author classified political words into four types: new political words, interdisciplinary political words, loan words and ambivalent words. The author proposed specific translation strategies for each type, such as word formation, interdisciplinary parallel corpora, special handling of loan words and semantic translation. This paper has a positive impact on the translation of English political words and has some guiding significance for promoting and strengthening the exchange and integration between Chinese and foreign cultures. ﹀
分类号：	H087/TP391
论文总页数：	181
参考文献总数：	46
馆藏号：	017/M2017(650)
公开日期：	2017-05-19

基于行为特征和数据分析的外语词汇学习模型研究.赵海威

链接

题名：	基于行为特征和数据分析的外语词汇学习模型研究
姓名：	赵海威
学号：	1501210805
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-19
关键词：	二语习得；词汇学习；数据分析
论文摘要：	︿在母语学习或是二语学习中，词汇习得均为关键步骤之一，对语言习得的重要性不言而喻。如何高效地进行二语词汇学习一直是学界关注的重点。以往的学者们对二语习得中词汇知识的组成部分进行了大量的研究，大多数学者们认为二语词汇习得应该从词汇形态、词汇含义以及词汇用法这三个角度进行学习。随着移动互联网技术的不断发展，应用市场上的单词学习软件层出不穷，移动端教学已扩展至词汇教学领域，在一定程度上扩展了传统课堂词汇教学模式。相比传统意义的词汇学习，移动端用户具有目的性强，使用时间趋于碎片化等特点。因此，在移动技术与词汇教学相结合的今天，有两个问题值得做进一步的讨论：一是如何将词汇知识组成部分的相关理论结合至移动学习流程设计之中，帮助用户高效完成词汇学习并提升词汇能力；二是用户在APP中进行哪些具体操作会对其二语词汇能力变化产生影响。针对以上两个问题，本文以二语词汇知识组成部分的理论为基础，提出了一个针对考研英语学生的词汇学习模型，并将该理念应用于“英语方舟APP”中。为了检验其有效性，本研究对22名APP内测用户的行为进行了为期三个月的观察和记录，通过在实验前后使用改进版的VKS测试以评估用户词汇能力的变化，并结合用户在单词自动发音、总学习时长、平均页面停留时间、添加例句以及查看例句中文等相关操作上的行为数据进行量化分析。此外，在实验结束后，本研究对用户进行了问卷回访，以验证数据分析结果是否有效。通过对用户行为数据的量化分析，本研究发现用户的词汇能力提升与添加例句、查看例句中文释义存在正相关的相关关系；成绩较差组学生的成绩变化与关闭自动发音功能的操作之间存在负相关关系；成绩较差组与成绩较好组与平均页面停留时间分别存在二次回归以及指数回归的关系；全部学生的学习成绩变化与学习总天数无显著的相关关系。回访问卷分析对上述结果进行了分析，并在此基础之上，进一步对“英语方舟APP”的产品设计以及教学理念提出了相关改善建议。﹀
分类号：	H319.3/TP391.7
论文总页数：	86
参考文献总数：	0
馆藏号：	017/M2017(662)
公开日期：	2017-05-19

常见易错搭配辨析及预测.耿思思

链接

题名：	常见易错搭配辨析及预测
姓名：	耿思思
学号：	1401210556
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-19
外文题名：	Critical Analysis and Prediction of Collocations Errors
关键词：	错误搭配；辨析；预测
外文关键词：	collocation error; critical analysis; prediction
论文摘要：	︿搭配在英语学习中占据重要地位，但研究表明，学习者往往在搭配知识上存在不足，从而导致他们在产出时用错字词。鉴于此，国内外学者对错误搭配的研究一直在进行，其研究方向也比较丰富，比如归类分析错误、探寻出错原因、自动批改错误等等，但这些研究整体来看都将关注点放到学生所犯错误上，却从未考虑提前预测错误搭配并将其用于教学；另外，上述研究中所用的错误搭配大都根据教师的直觉进行标记，但很多直觉上的错误在英语语料库中其实是存在的，如“learn knowledge”截至目前在NOW（News On the Web）语料库中共出现了12次，在出现的语境中表达的意思正是“学习知识”，由此看来，“learn knowledge”并非错误，只是相对“acquire knowledge”而言更不地道而已。据此，本文旨在讨论三方面的问题：第一，错误搭配的提取和分析。根据相关研究，在错误搭配中，动词和名词、形容词和名词、名词和名词类占了大多数，因此笔者从中国英语学习者语料库（Chinese Learner English Corpus，简称CLEC）中收集整理了713个相关类型的错误搭配，然后利用Google搜索引擎和语料库对某些“众所周知”的错误进行辨析，证明随着时间流逝，一些习以为常的“中式表达”正慢慢进入英语语料，虽相比地道表达数量较少，但它们仍然存在，因此应采取辩证的态度看待这些错误；第二，错误搭配词来源探讨。在“直翻”和“过度泛化”方式的指导下，利用thesaurus、中英词典及《牛津搭配词典》对错误搭配词的来源进行了新的探索，发现错误搭配词来源于四个方面：直翻词、直翻词的同义词、搭配组词及同义词的搭配组；第三，易错搭配预测。主要综合“来源”和错误搭配词“特征”对判定易错搭配的条件进行讨论，并利用得出的条件进行预测，最终得到大于20,000条易错搭配。最后，为了对其应用性进行说明，本文以单词教学为例设计了具体的教学情境，并邀请20名学生和20名老师对其评价，评价结果显示，大部分人（19/20）认为本文设计的教学情景能帮助学生更好地运用单词。﹀
外文摘要：	︿ For English learners, the appropriate use of collocation is the key to making natural and native-like sentences, which are the elements of compositions. Considering its significance, researchers at home and abroad have long studied it. They usually adopt error analysis, focus on the types of wrong collocations made by EFL learners and finally put forward some suggestions. However, few researchers study how to predict collocation errors and how to apply them to learning activities. Besides, few papers discuss how to decide whether a collocation is wrong. Researchers tend to identify errors by intuition, but intuition does not always work. For example, "learn knowledge", as a well-known error, has appeared 12 times in the NOW (News On the Web) corpus. In the sentence "You've changed a lot in college—not just to learn knowledge, but more importantly made the way of thinking, creativity and communication.", the collocation exactly conveys the meaning of “学习知识 (xue xi zhi shi)”. Thus we cannot conclude that “learn knowledge” is wrong but only suggest that it is not widely accepted by native speakers. According to the facts above, the paper discusses three problems: critical analysis of wrong collocations, the source of incorrect collocates (verbs, adjectives and nouns), and the prediction and application of collocation errors students tend to make. For the first one, we use the NOW (News On the Web) corpus and the Google search engine to analyze several “well-known” errors. The paper aims to show readers that with the passage of time, some "Chinese expressions" have been accepted by native speakers. Thus a critical attitude is required towards those errors. For the second one, we adopt the notions of direct translation and over-generalization to find the source of wrong collocates. 
Based on 713 errors we collect from CLEC (Chinese Learner English Corpus), the paper finds that incorrect collocates come from four aspects: direct words, synonyms of direct words, collocates in collocation clusters and collocation clusters of synonyms. And in the paper the direct words come from iciba API, the synonyms from Thesaurus.com, the collocation clusters from Oxford Collocations Dictionary. For the third one, we find that many conditions can be used to predict collocation errors students usually made. And we choose two of them and finally get >20,000 items. With those items, the paper designs a learning unit, which aims to promote vocabulary output of students. At last, 20 students and 20 teachers are invited to evaluate the unit and the result shows that most interviewees (19/20) think it can promote the acquisition of productive words. ﹀
分类号：	H087/TP391
论文总页数：	78
参考文献总数：	41
馆藏号：	017/M2017(736)
公开日期：	2017-05-19

面向技术文档写作修改过程的书面沟通优化方法的设计与实现.林梦姣

链接

题名：	面向技术文档写作修改过程的书面沟通优化方法的设计与实现
作者：	林梦姣
学号：	1401210635
专业：	计算机技术
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师姓名：	俞敬松
导师单位：	软件与微电子学院
答辩日期：	2017-05-19
关键字(中文)：	技术文档；沟通；邮件；草稿错误
文摘：	︿开发英文技术文档时，技术工程师、程序员（以下简称技术人员）负责撰写英文草稿，技术写作者负责对英文草稿进行修改。双方各司其职、通力合作，最终交付正规的英文技术文档。国内许多公司之所以采取这样的工作模式，是因为技术写作者的专业技术知识不足，而技术人员的英文写作能力又欠佳。这种工作模式往往引发两个问题。第一，草稿的质量被高估。第二，草稿修改过程中，技术写作者与技术人员的沟通质量参差不齐。为了更好地完成草稿修改的工作，技术写作者应该通过自己的努力，提高沟通效率，进而改善草稿的修改质量。因此，本文提出一套面向技术文档写作修改过程的书面沟通优化方法，以邮件沟通为切入点，由技术写作者主动地优化邮件写作方式，改善双方的沟通，最终提高草稿的修改质量。笔者先研究技术人员与技术写作者就草稿的错误进行沟通时，技术人员对于技术写作者主动发起的沟通是否满意。然后，研究技术写作者主动发起沟通时，其撰写的邮件质量如何，邮件当中存在哪些问题。最后，根据研究而得的沟通问题，提出改善的方案，分别从邮件标题、错误位置标注、错误叙述的三个方面改善邮件写作方式，优化技术写作者与技术人员的沟通，最终实现提高草稿修改质量的目的。为了验证面向技术文档写作修改过程的书面沟通优化方法的有效性，本文进行了技术写作者一边使用邮件与技术人员沟通草稿错误、一边修改草稿的对比实验。实验历时四周，以10名初级技术写作者为实验对象。实验组接受书面沟通优化方法的培训，而对照组不接受培训。通过前测、后测以及访谈的形式收集数据，从定量和定性的角度分析笔者设计的书面沟通优化方法的有效性。通过数据分析，笔者得出的结论是，笔者设计的面向技术文档写作修改过程的书面沟通优化方法，能起到提高草稿修改质量的作用。﹀
分类号：	H087/TP391
论文总页数：	87
参考文献数：	0
馆藏号：	017/M2017(820)
公开日期：	2020-05-19

西方服饰术语的翻译策略——以《现代时尚历史：1850-2010》为例.李亚楠

链接

题名：	西方服饰术语的翻译策略——以《现代时尚历史：1850-2010》为例
姓名：	李亚楠
学号：	1401210619
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-19
关键词：	西方服饰术语；术语翻译；翻译策略
论文摘要：	︿服饰与人类生活密切相关，目前国内对于服饰的翻译研究虽然已经取得一定成果，但主要以中国传统服饰的外译为研究对象。近代以来，中国深受西方服饰文化的影响，然而针对西方服饰术语的汉译问题仍然缺乏深入的讨论，也缺乏系统性的翻译策略。本文基于《现代时尚历史：1850-2010》（The History of Modern Fashion: From 1850）的翻译实践，该书按时间顺序梳理了英国、法国、美国等西方国家从1850年至2010年女装、男装、童装的发展变迁，其中涉及大量西方服饰术语。本文详细探讨了西方服饰术语翻译过程中所遵循的原则以及所采用的策略，以期为同类书籍的翻译提供借鉴，同时对西方服饰术语的汉译研究有所贡献。针对西方服饰术语的自身特点以及翻译书籍的性质，本文提出西方服饰术语翻译的三大原则，认为西方服饰术语的翻译不必像其它学科术语翻译那样一味追求字面上的准确对应，大量采用生硬音译更不可取。相比而言，翻译的经济趣味性和便于读者认知更为重要。在此三大原则的基础上，提出四条术语翻译策略：以文献为基础、借鉴中国服饰术语命名理据、以图片为辅助以及适当补偿信息，并且结合具体实例对这些策略进行了详细的探讨，验证了策略的有效性。笔者对西方服饰术语翻译的理解和本文的挖掘深度仍有待提高，期待有更多感兴趣的学者能够对此领域进行研究。﹀
分类号：	H087/TP391
论文总页数：	272
参考文献总数：	41
馆藏号：	017/M2017(457)
公开日期：	2017-05-19

异语写作的无本回译研究——基于《中国营养疗法：中医营养学》一书的翻译.陈培琳

链接

题名：	异语写作的无本回译研究——基于《中国营养疗法：中医营养学》一书的翻译
姓名：	陈培琳
学号：	1401210520
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-19
外文题名：	Textless Back Translation Study in Foreign Language Writing
关键词：	中医；异语写作；无本回译
外文关键词：	Traditional Chinese medicine; Foreign language writing; Textless back translation
论文摘要：	︿异语写作，即语言和文化非同源的创作。以中国文化为例，用英文向西方读者介绍中国文化的作品就属于异语写作。当这类作品译回中文时，便产生了无本回译，即没有原本可做依据的回译。本文基于德国作者卡斯特纳的《中国营养疗法：中医营养学》（Chinese Nutrition Therapy: Dietetics in Traditional Chinese Medicine）一书的翻译实践，研究中医的异语写作以及无本回译。针对回译实践中遇到的翻译问题，提出翻译的原则和策略。笔者分别从拼音的使用、词汇的回译和引文的回译三个方面分析无本回译的问题。第一，拼音的使用。笔者研究了拼音系统混用的情况、中医中拼音的使用并指出原书中出现的多处拼音拼写错误，提出回译可对拼音进行检测。第二，词汇的回译。本次翻译的书籍中涉及大量中医术语的翻译。中医术语的表述跟日常汉语有所不同，翻译时需要谨慎对待，通过参考专业书籍或者搜索互联网以求实现“至译”。另外，笔者还分析特殊词汇，如“cold”的回译，针对五种不同语境，提出了回译的建议。第三，引文的回译。笔者自建了《黄帝内经》的简易语料库，通过准确翻译关键词、对英文表述直译、语料库或互联网检索关键词和根据直译的中文表述对检索结果进行筛选四步，进行引文的回译。此外，对于来源信息不详的引文，回译时需要尽可能多的搜集上下文信息，锁定可能的典籍范围，然后查阅典籍，进行验证。对于无法找到原文出处的引文，则采用译者自译的方法，但需仿造文言文形式、去掉引号并进行备注。笔者通过对中医异语写作的回译研究，提出了归化与异化并用、系统化原则和受众主导的回译原则，以及制作术语表、分析回译场、借助语料库和省译的回译策略。本文的研究仅是对中医异语写作的无本回译研究的初试，希望未来学者能进行更加深入和广泛的研究，以弥补本文的不足之处。﹀
外文摘要：	︿ Foreign language writing means that the language and the culture of a work belong to different nations. Taking Chinese culture as an example, works that introduce Chinese culture to Western readers in English are so-called foreign language writing. When such works are translated back into Chinese, the result is textless back translation, which means there is no original text for reference. Based on the translation practice of Chinese Nutrition Therapy: Dietetics in Traditional Chinese Medicine, this paper studies foreign language writing and textless back translation of traditional Chinese medicine (TCM). In response to problems encountered in the process of translation, the author discusses translation principles and strategies. The author analyzes the problems of back translation from three aspects: the use of Chinese Pinyin, the translation of special nouns (such as TCM terminology), and citations. The author puts forward three principles, namely the combination of domestication and foreignization, the systematic principle and the audience-dominant principle, and four translation strategies, namely making a glossary, analyzing the back translation field, making use of the corpus and omission. This paper is only a preliminary study of the textless back translation of foreign language writing on TCM, and more in-depth and extensive research is needed in the future to make up for its shortcomings. ﹀
分类号：	H3
论文总页数：	245
参考文献总数：	0
馆藏号：	017/M2017(563)
公开日期：	2017-05-19

中美高校国际形象片叙事对比研究.段夕超

链接

题名：	中美高校国际形象片叙事对比研究
姓名：	段夕超
学号：	1401210543
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-19
外文题名：	A Comparative Study on the Narration of the International Promotional Videos of Chinese and American Universities
关键词：	高校国际形象片；叙事差异；叙事建议
外文关键词：	University’s International Promotional Video; Narrative Differences; Narrative Strategies
论文摘要：	︿中国高校国际形象片传播效果欠佳，先有网友批判北京大学国际形象片和耶鲁大学国际形象片相去甚远，后有复旦大学国际形象片抄袭东京大学形象片创意被国内外诟病，这一前一后凸显了国内高校国际形象片的不足。虽然问题显著，但相关研究凤毛麟角。本文尝试找出国内高校国际形象片的问题所在，以期改善其传播效果。美国高校国际形象片的传播效果优于中国高校国际形象片，且两者在叙事上存在明显差异，本文假设其叙事差异在一定程度上影响了两者的传播效果。为验证假设，本文以北京大学形象片《北大欢迎你》和哈佛大学形象片《哈佛，一切皆有可能》为例，结合其他48部中美高校国际形象片，借助AntConc、LIWC等工具对中美高校国际形象片的叙事进行定性和定量分析，旨在明确以下3点问题：从叙事角度看，中美高校国际形象片存在哪些差异？这些差异是否对形象片的传播效果有影响？中国高校国际形象片在叙事层面可以作何改进？通过分析，本文在叙事角度总结出体现中美高校国际形象片差异的8对叙事差异：功能性人物观和心理性人物观、单一性和多样性、重低层次需求和重高层次需求、重知识传受及重知识探索和实践、重学术和学术艺体并重、重国家使命和重个人发展，第三人称视角和第一人称视角，语言真实度、可读性低和语言真实度、可读性高。随后，笔者对国外受众针对上述叙事差异进行调查，验证了中美高校国际形象片的叙事差异的确对其传播效果有影响，美国高校国际形象片的叙事方式更容易为受众所接受。最后，笔者尝试从宏观和微观两个层面对上述8对叙事差异产生的原因做了一定解读。并参照分析，就国内高校国际形象片应该讲什么故事，应该如何讲故事提出6点建议，希望能为国内高校国际形象片的制作提供参考。﹀
分类号：	H087/TP391
论文总页数：	78
参考文献总数：	47
馆藏号：	017/M2017(567)
公开日期：	2017-05-19

基于语义场的金融词汇翻译策略——以《股权众筹投资指南》为例.鲁晨

链接

题名：	基于语义场的金融词汇翻译策略——以《股权众筹投资指南》为例
姓名：	鲁晨
学号：	1401210663
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-19
关键词：	金融词汇；语义场；词汇翻译
论文摘要：	︿金融英语是一种专门用途的英语，其词汇与人们的生活息息相关，很多来源于普通英语，具有对义性、反义、上下义的特征，而这些特征正是语义场的概念范畴。目前语义场理论运用在翻译中的并不多，大多数考虑的还是语义场的聚合关系，组合关系和联想关系在翻译中探讨的相对较少。语义场理论认为词与词之间存在聚合关系、组合关系和联想关系。反映在金融领域，聚合关系包括上下义、对义、反义、近义、整体与部分、递进等关系，词项的词性上具有统一性；组合关系则包含名词与名词、名词与动词、形容词与名词、连字符与名词等组合成的语义场，场中的词项具有相同的核心词；而联想关系则不受词性和核心词的限制，可以将聚合词项、组合词项以及其他通过联想产生关系的词项都包含进来，让金融词汇语义场的范围更加完整。正是金融词汇的聚合关系、组合关系和联想关系让语义场具有了层次性、多样性、变化性和民族性的特征。只有把握金融词汇语义场的关系和特征，才能准确界定出金融词汇的含义。结合《股权众筹投资指南》的翻译总结，笔者提出关系对应法、关系结合法、关系平移法和关系隐藏法的翻译策略来帮助译者界定金融领域专业词汇和普通词汇的含义。由于笔者所翻译书籍中涉及到大量的专业金融词汇和法律词汇，对相关知识的了解尚有欠缺，希望未来学者能更加准确和深入地分析金融词汇语义场中的关系和特征。﹀
分类号：	H087/TP391
论文总页数：	212
参考文献总数：	0
馆藏号：	017/M2017(592)
公开日期：	2017-05-19

翻译修改过程中技术审校和语言审校的对比研究.李雅琦

链接

题名：	翻译修改过程中技术审校和语言审校的对比研究
姓名：	李雅琦
学号：	1401210618
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软件与微电子学院
论文答辩日期：	2017-05-19
关键词：	翻译修改过程；技术审校；语言审校
论文摘要：	︿计算机科学发展迅速，同时也催生了许多新兴学科，“大数据”就是其中之一。国内出版社或翻译公司引进此类书籍的外文版数量越来越多，对译文的审校原则上应该是语言审校和技术审校的融合，两个环节缺一不可。但实际情况是很多出版社由于人员、经费不足或时间紧张等问题，不重视审校，使得审校环节变得可有可无，能简则简。在翻译出版具有专业性的应用型书籍时，出版社主要采取由语言背景译者进行翻译，再由相关技术背景的编辑进行审校；或者由技术背景的译者进行翻译，再由语言背景的编辑进行审校；甚至存在由译者自己审校，也就是文责自负的情况。由于审校流程和质量没有统一的规范，导致大数据类书籍的出版效率低下，出版发行的书籍质量参差不齐。本文基于《数据人格化》和《数据决策化》两本大数据类书籍的翻译和审校过程中出现的具体案例，参考问卷访谈的结果，采用定性和定量的方法对比分析语言审校和技术审校修改过程中的侧重点，对待各类翻译错误的态度、审校过程中存在的问题、以及审校过程参与主体之间出现的分歧。例如，分析结果表明，语言审校和技术审校在词汇的处理上，漏译现象上都能帮助提高译文质量，但对待翻译腔、增译和意译的态度有所不同，识别原文本错误的能力也不同。此外，语言审校和技术审校存在审校无效、审校错误和审校不足的问题，这些问题会对翻译质量和效率产生负面影响。翻译审校是一个关键、复杂的过程，也是保证译文质量的最后一道防线。相对于越来越成熟的翻译流程和规范，翻译审校的规范化进展缓慢。笔者针对实践中出现的问题，提出了翻译审校的改进策略，严格筛选审校人员，加强审校培训，提高翻译审校者的能力；规范审校流程；制定审校要求并明确出现分歧时话语权的归属。通过具体的案例分析，希望能给具有一定专业性书籍的翻译审校提供借鉴。﹀
分类号：	H087/TP391
论文总页数：	238
参考文献总数：	38
馆藏号：	017/M2017(762)
公开日期：	2017-05-19

基于国内创业者需求的海外创业公司新闻编译策略研究.梁欣

链接

题名：	基于国内创业者需求的海外创业公司新闻编译策略研究
姓名：	梁欣
学号：	1401210629
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	王雷
导师1单位：	外国语学院
导师2姓名：	张宏岩
导师2单位：	软件与微电子学院
论文答辩日期：	2017-05-19
外文题名：	Research on Transediting Strategy of Overseas Startups News Based on Domestic Entrepreneurs' Needs
关键词：	创业公司新闻；信息需求；新闻特点；编译策略
外文关键词：	Startups news; Information needs; News characteristics; Transediting strategy
论文摘要：	︿随着我国创业热潮的兴起，国内创业者迫切需要更多了解海外创业信息，国内许多媒体平台已经开始大量编译海外创业公司新闻为国内创业者服务。目前学术界对于创业公司新闻的研究较少，特别是编译方向。调研发现，国内新闻媒体对于创业读者群体的需求不明确，相关调研数据较少，缺乏系统的编译策略。针对当前创业新闻媒体的问题，本文通过读者访谈与问卷调查的方式调研创业读者的需求，并结合创业公司新闻的特点，提出具有针对性、系统性的创业公司新闻编译策略。本文首先通过文献研究与对比分析，讨论创业公司新闻的范畴和特点，分析创业公司新闻编译的独特性。其次，基于前期的读者访谈，提炼出问卷要点，面向国内创业读者群体发放大量调研问卷，并对读者需求进行统计分析。最后，针对业界媒体的编译现状，归纳当前编译流程中编译者碰到的三大问题，提出系统的创业公司新闻编译策略，并通过读者评分的方法论证本文所提策略的有效性。本文的调研发现读者需求具有求新、求快、求异和求学的特点，读者希望在尽可能短的时间内，读到并读懂所需要的全部创业公司新闻信息。本文结合读者需求和创业公司新闻的特点，提出系统的创业公司新闻编译策略——包括信息源优化策略、信息结构化策略和编译技巧三个方面。信息源优化策略提出了四个信息源选择方向和十种信息源类型；信息结构化策略指出创业公司新闻除了包含新闻五要素以外，还应包含创业公司新闻的八个基本信息点；编译技巧包括不译策略、引用策略、客观化策略和新媒体辅助策略。策略验证结果表明，基于读者需求和创业公司新闻特点所提的编译策略更系统全面，其译文更受读者欢迎。﹀
外文摘要：	︿With the boom in domestic startups, Chinese entrepreneurs want to know the latest information about startups abroad, which may affect their business decisions. Domestic media have started transediting overseas startup news for entrepreneurs. Little research has been done on startup news, and even less on its transediting. Surveys found that domestic media have little knowledge of domestic entrepreneurs’ needs, so current transedited startup news cannot meet those needs. To address these problems, this paper investigates, through interviews and questionnaires, what domestic entrepreneurs need when reading startup news, and puts forward a systematic, targeted transediting strategy based on the characteristics of startup news. First, through literature research and comparative analysis, the paper describes the scope of startup news, its characteristics, and what makes its transediting distinctive. Second, based on reader interviews, the paper derives the key survey questions, collects a large number of questionnaires from domestic entrepreneurs, and analyzes the responses statistically to summarize readers’ needs. Finally, in light of current practice in startup news media, the paper identifies three problems that transeditors encounter in the transediting process, proposes a systematic transediting strategy for startup news, and validates the strategy through reader ratings. The investigation found that domestic entrepreneurs want startup news that is novel, timely, distinctive and informative, and that they hope to read and understand all the information they need in the shortest possible time. Based on readers’ needs and the characteristics of startup news, the study proposes a transediting strategy with three parts: information source optimization, information structuring, and transediting techniques. The information source optimization strategy proposes four directions for selecting sources and ten types of sources. The information structuring strategy holds that, besides the five basic elements shared with ordinary news, startup news should contain eight basic information points of its own. The transediting techniques include the non-translation strategy, the quotation strategy, the objectification strategy and the new-media-aided strategy. The validation results show that the transediting strategy based on readers’ needs and the characteristics of startup news is more systematic and comprehensive, and that readers prefer the texts it produces.﹀
分类号：	H087/TP391
论文总页数：	104
参考文献总数：	51
馆藏号：	017/M2017(755)
公开日期：	2017-05-19

2016-11-30

出入境货运船舶边检管理风险评估系统的设计和实现.陈嘉慧

题名：	出入境货运船舶边检管理风险评估系统的设计和实现
姓名：	陈嘉慧
学号：	1301221445
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-11-30
关键词：	边检管理；货运船舶；风险评估
论文摘要：	︿近年来，随着社会的进步发展，越来越多的科学技术逐渐应用到边防检查业务工作中，出入境边防检查工作的科技含量和信息化、现代化水平显著提高。这些科学系统的应用成为支撑“大进大出”口岸格局的关键支柱，是出入境边防检查管控与服务工作的重要保证。出入境海港业务依靠现有的“网上报检”和视频监控平台，在缩短出入境人员报检时间，对出入境船舶进行巡查监控方面取得了显著效果；但仍然存在一些不足，例如在管控方面,船舶的风险评估标准不统一，评估结果简单划分为高中低三个风险等级，对应三种不同的巡查监护措施；分类方法简单，应对方法单一，存在主观判断标准。针对以上不足，本文以货运港口岸的出入境船舶为研究对象，采用现场勘查、过程分析、专家认证、案例分析研究等方法对船舶边检管理存在的风险进行识别，提炼出影响船舶边检管理的风险因素；采用情景分析、业务影响分析、人因可靠性分析和风险矩阵图等形式对存在的问题进行定性定量分析，将风险因素数值化和具体化，形成风险评估因子；在保持原有风险等级划分标准的基础上，细化并统一评估标准，根据船舶不同风险等级和停靠地点制定不同的检查监管措施；最后依托计算机技术支持，建立一套风险评估系统。 “出入境货运船舶边检管理风险评估系统”根据风险评估结果，区分船舶种类和停靠地，对船舶实施分级分类服务管理，使管控目标更有指向性，警力投放更有针对性。该系统的开发将为海港边检机关执勤队提供更科学统一的决策依据，缩短船舶风险评估处理、流转时间，进一步提升工作效率，在合理安排警力的同时提高检查、巡查、监督和服务的效果,实现船舶边检管理风险评估的便捷化、规范化、联动化。﹀
分类号：	TP3
论文总页数：	69
参考文献总数：	0
馆藏号：	017/M2016(1195)
公开日期：	2016-11-30

体裁分析视角下南海事件新闻报道研究.韩易菲

题名：	体裁分析视角下南海事件新闻报道研究
姓名：	韩易菲
学号：	1301210649
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	柏晓静
导师1单位：	清华大学
导师2姓名：	张宏岩
导师2单位：	软件与微电子学院
论文答辩日期：	2016-11-30
外文题名：	A Genre-Based Analysis of Chinese and American News on the South China Sea Issue
关键词：	体裁分析；语步结构；南海事件新闻；语料库
外文关键词：	Genre analysis; Move structure; South China Sea news; Corpus
论文摘要：	︿南海问题由来已久，与此同时，南海局势升级，美菲不断宣传中国威胁论，美国出于某种战略目的强势介入干预，使南海问题国际化程度进一步加深，在种种复杂态势下，中国主权诉求在国际社会很难得到支持。而南海事件不仅关系国家主权利益，也涉及中国形象的塑造。因此，南海事件英文新闻报道对于国际舆论中国形象的杠杆作用十分重要。美菲利用媒体舆论斗争来争取国际支持，积蓄政治的主动性。面对这些问题，中国英文新闻报道在南海事件对外传播中如何针对受众提高传播效果，制定更长远的对外传播策略，提升中国形象至关重要。本文从体裁分析的视角入手，针对新闻语步结构和语言策略提出南海事件中国英文新闻报道的写作建议。本研究选择中美主要媒体CCTV－news、China Daily、Beijing Review、CNN、New York Times、The Washington Post，从中选取约28万词的南海事件英文新闻报道建立中美新闻语料库各一个。首先，本研究从宏观语步结构出发，基于前人新闻体裁结构模型，结合南海事件新闻报道特殊的交际目的，新闻报道观点的鲜明性、突出的战斗性、对抗性、冲突性，报道结构的多层次、复杂化特点，创新性地提出了新的语步步骤，其中包含属性、详细报道及次要事件语步，以及步骤直接引语、间接引语、媒体评论、官方评论、视觉辅助、其他细节、直接结果、间接结果、预测性结果，通过先导试验验证可行性，制定出南海事件新闻语步结构模型。本文通过定性定量对比200篇中美关于南海事件新闻报道中具体语步和步骤的异同，解释分析其传播功能并对中国关于南海事件英文新闻写作提出语步结构参考建议。其次，本研究从微观语言策略入手，尝试探究不同意识形态下语言策略的异同。通过对比中美新闻热点词分析双方关注点，美国新闻更关注整个亚太安全态势，及海牙国际仲裁法院裁定合法性等，构建中国霸权形象，其热点词体现出南海新闻的战斗性和对抗性，且美国热点词已形成较完善的南海新闻话语体系。中国热点词更关注国家主权，态度鲜明，力求和平解决冲突。通过态度情感词分析透视出中美南海新闻倾向性，美国以中立报道居多，正面、负面报道数量较为平衡，看似客观。中国则主要为正面报道，负面报道极少，难免使受众认为报道不够客观中立。尽管中美引语主体均已较为多元化，但中国报道引语使用频率低于美国，且中国直接引语、间接引语主要来源与美国恰恰相反，美国直接引语多源自本国，隐蔽地向受众渗透其立场观点，并使报道表面上更具说服力，提高受众对新闻的可信度。针对这些问题本文为我国英文新闻写作提出相应的语言策略参考建议。﹀
分类号：	H087/TP391
论文总页数：	88
参考文献总数：	0
馆藏号：	017/M2016(1190)
公开日期：	2016-11-30

2016-05-31

基于语料库的科技文翻译腔研究.张琳

题名：	基于语料库的科技文翻译腔研究
姓名：	张琳
学号：	1301211084
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2016-05-31
外文题名：	A Corpus Study on Translationese in Scientific Text
关键词：	科技文翻译；翻译腔；语料库
外文关键词：	Translationese; Scientific Text; Corpus Linguistics
论文摘要：	︿随着中外科技交流日趋频繁，科技文的翻译需求不断增大，译文质量也受到广泛关注，其中，影响科技译文质量的最重要的因素之一就是翻译腔。众多研究表明，翻译后的文字与原创文字有着众多不同，这种词汇、语法方面的不同即被统称为翻译腔（Translationese）。前人对科技文中的翻译腔的研究大多停留在定性研究上，而近年来，语料库方法成为研究语言的重要工具，为科技文中的翻译腔研究带来了新的视角。本文采用了语料库研究方法，自建科技文译文语料库和科技文原创语料库，以北京大学CCL双语语料库为参考语料库，从简易性、清晰性、标准性、干扰性和其他五个方面入手，以句子长度、类符形符比、实词比例、连词比例、定语长度、字母词、词缀、被动句、冠词、英语固定句型、标点符号、换算单位为观察点，深入分析了科技文中翻译腔的表现形式及其特殊性。研究表明，科技译文有着句读较长、词汇较啰嗦、关联词多、定语长、被动句多、破折号使用多、换算单位欧化的特点，并且受英语冠词a/an、固定句型的影响较大。值得注意的是，本研究发现，破折号的使用和换算单位欧化属于科技文中的特有翻译腔现象，而字母词、词缀的翻译腔现象在科技文中却已经成为一种趋势，正在被汉语慢慢吸收和融合。在对科技文中的翻译腔现象进行了定性和定量分析之后，本文结合《大数据训练营》一书的翻译实践，认为应该辩证地看待科技文中的翻译腔：对于可被接受的翻译腔保持宽容和接纳的态度；对于不可接受的翻译腔，本文则从简易性、清晰性、标准性、排除干扰、标点符号规范化、换算外来单位六个方面提出了相应的翻译策略。﹀
外文摘要：	︿As China places more emphasis on science and technology, scientific text has become one of the most popular genres in the market, and more scientific texts are being translated into Chinese. Accordingly, the quality of scientific translation has received extensive attention. One of the biggest factors affecting the quality of translated scientific text is translationese, meaning the lexical and syntactic differences between translated text and original, non-translated text. Most previous research has been qualitative, but with the development of corpus linguistics, a new method has become available for analyzing translationese. Taking the corpus of the Center for Chinese Linguistics (CCL) at PKU as the reference corpus, our study constructs two corpora: a translated scientific text corpus and an original scientific text corpus. We examine translationese from five aspects: simplification, explicitation, normalization, interference, and others. Through corpus analysis of sentence length, attributive length, lexical density, punctuation, affixes, modifiers, indefinite articles, passive sentences, units, and the frequency of conjunctions and notional words, we identify the special features of translationese in scientific text. These include the overuse of conjunctions, passive sentences and indefinite articles, as well as overlong modifiers. Overuse of dashes and the non-conversion of units are the most noteworthy features of translationese in scientific text. Meanwhile, we also find that English words and affixes, which are also forms of translationese, are being accepted by Chinese. After quantitative and qualitative analysis, and drawing on the author’s translation of the book Big Data Bootcamp, we discuss how to treat translationese dialectically. Translationese that has positive effects and has been accepted by Chinese should be tolerated, while translationese that has negative effects on Chinese should be avoided. We therefore give advice to translators from six aspects, based on the corpus data and translation practice.﹀
分类号：	H087/TP391
论文总页数：	196
参考文献总数：	0
馆藏号：	017/M2016(988)
公开日期：	2016-05-31

Dota类对战游戏台词翻译策略研究.宋军

题名：	Dota类对战游戏台词翻译策略研究
姓名：	宋军
学号：	1101210879
专业：	软件工程
公开时间：	2年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-31
关键词：	情景语境；文化语境；译者创造性和约束；语言特点；游戏台词
论文摘要：	︿随着越来越多的国外网络游戏进入中国，国内网络游戏市场得到快速发展。网络游戏的本地化翻译工作变得越来越重要，它直接影响着游戏玩家的游戏体验。然而，国内对于网络游戏翻译的研究较少，译者在翻译时通常缺乏方法论的指导，翻译质量良莠不齐。在本地化翻译实践中，游戏台词的翻译通常是以影视、戏剧台词的翻译为参考。但与影视台词翻译相比，游戏台词翻译又有许多不同之处。首先，游戏台词翻译（本文专指Dota类游戏台词翻译）不受上下文语境的约束，游戏台词具有一定的独立性。翻译时就需要更多地考虑游戏台词的语言特点、文化语境因素、情景语境因素等；其次，Dota类游戏台词由于采用了配音台词的展示方式而不像影视台词翻译受到严格的时间和空间的限制。最后，受众视角的差异也影响着游戏台词的翻译。《英雄联盟》是Dota类游戏的代表作品之一。本文以其游戏台词为研究对象，研究游戏台词语言特点，语境特征，译者的创造性与约束怎样影响译者选择翻译方法和制定翻译策略。本文在参考影视台词翻译方法的基础上，结合游戏台词的自身特点，以游戏台词的语言特点、游戏台词的语境特点、译者的创造性和约束为研究内容来探索此类游戏台词的翻译策略。游戏台词的翻译不仅要以归化、异化的翻译理论为指导，还要结合游戏台词翻译的特殊性来进行翻译。在本文研究过程中，笔者以游戏台词的语言特征、语境特征和译者的创造性和约束理论为研究内容，结合翻译实例和调查问卷分析来探索游戏台词的翻译策略。研究表明，以游戏台词语言特点、语境因素和译者的创造性和约束为依据制定的翻译策略可以基本上满足游戏台词译者的翻译需求。﹀
分类号：	H087/TP391
论文总页数：	47
参考文献总数：	0
馆藏号：	017/M2016(1019)
公开日期：	2018-05-31

科学修辞劝说视角下的科普汉译策略——以 Debunk It 为例.孙庆娟

题名：	科学修辞劝说视角下的科普汉译策略——以 Debunk It 为例
姓名：	孙庆娟
学号：	1201210789
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-31
关键词：	科学修辞；科普翻译；劝说；翻译策略
论文摘要：	︿现在生活中充斥着很多科学谣言或者虚假信息，加上所谓的“专家”及“研究机构”的迷惑，人们很难从纷繁的信息中筛选出有效可靠的信息。“辟谣”类科普应运而生，关注的话题多种多样，满足了普通大众开阔视野、明辨是非的需要。目前总体来讲，英文科普书籍引进了很多，但翻译质量参差不齐，对科普文本的修辞及翻译策略研究较少。本次拟翻译的书籍为约翰·格兰特的《揭露真相》，出版于2015年。此书主要揭露了伪科学的一些常用手段，讲述了如何运用批判性思维分辨虚假信息，并给出了一份胡话快速检查单。此书有多种修辞安排，例如频繁使用第一、二人称，强调手段多种多样，给人一种面对面交流的感觉。本翻译报告从科学修辞劝说理论入手，针对科普文本的科学性、普及性、趣味性和人文性的特点，指出译者需有修辞劝说意识，尽量在译入语中建立修辞者和修辞受众的联系。在此基础上，提出“简单”和“认同”两大翻译原则，并总结出选用中立词汇、根据目标受众特点选择词汇、运用多重强调手段、补充事实内容等几项策略，便于译者更好地了解原作者的劝说目的，找到适合的表达方式，翻出适应目标读者的译本。﹀
分类号：	H315.9
论文总页数：	198
参考文献总数：	0
馆藏号：	017/M2016(1022)
公开日期：	2016-05-31

移动网页式说明书中不同程度渐进式呈现及其效果研究.朱灿华

题名：	移动网页式说明书中不同程度渐进式呈现及其效果研究
姓名：	朱灿华
学号：	1201211059
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	姚亚芝
导师1单位：	软件与微电子学院
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2016-05-31
关键词：	渐进式呈现；技术文档；移动网页式说明书；概念性内容；程序性内容
论文摘要：	︿智能手机现在已经成为人手必备的生活、工作工具，人们开始在手机上做任何事情：社交、购物、娱乐。而对于技术传播行业来说，移动终端也成为了一个新的技术文档发布平台，随着智能设备的飞速发展，移动技术文档带来了新的发展契机，同时也面临着新的问题和挑战。面对移动可用性问题、用户使用移动设备的行为特点以及阅读说明书时候的偏好、态度和体验，传统的网页或单纯响应屏幕大小的移动网页显然不能满足需求。交互设计领域的渐进式呈现（Progressive Disclosure，以下简称PD）理念为解决上述问题的许多关键方面提供了契机，尤其是DITA的灵活结构和PD的结合有着很大的可能性。在技术写作领域PD的应用和研究存在着很大的空缺，本研究尝试将交互设计的PD理念应用到说明书，尤其是移动网页式说明书内容的呈现上，并就其效果进行实证研究。本研究首先应用了Andrea Ames 提出的“用户阶段”和“信息等级”以及后者与技术文档内容的对应关系做出了说明书内容是否适合PD处理的第一级划分。接着应用了说明书程序性内容的四元模型，以技术写作结构DITA为例与之对应，针对这些内容和对应的DITA元素是否适合PD处理进行了德尔菲法专家咨询，形成说明书内容的PD二级划分。专家就这些内容是否适合PD处理最后达成了较为一致的意见，并表示PD是提升说明书移动阅读体验的一种有效方式。在上述两级划分的基础上，笔者设计出全PD版、半PD版和无PD版三个版本的SDL Trados Studio软件使用说明书并就其使用效果实施了对比试验。试验结果证明对说明书内容进行适当的PD处理可以提高说明书的可用性、用户满意度和参与度，并且能降低用户的任务负荷；但是在用户对软件概念的掌握程度和迁移学习效果方面，经过PD处理的说明书则不如无PD版的说明书，需要在实际应用中根据需要注意平衡这两方面的影响。此外，研究还通过对被试操作软件和使用说明书屏幕录像分析，发现了用户对各部分PD信息的使用情况各不相同，并通过抽样访谈了解了用户对说明书的看法，最后形成对PD版说明书的优缺点总结以及说明书内容PD处理的注意事项，为行业实践提供参考。﹀
分类号：	TP3
论文总页数：	115
参考文献总数：	0
馆藏号：	017/M2016(1030)
公开日期：	2016-05-31

批评话语分析视域下的中美英智库中“一带一路”评论对比研究.肖杰

题名：	批评话语分析视域下的中美英智库中“一带一路”评论对比研究
姓名：	肖杰
学号：	1301211011
专业：	软件工程
公开时间：	1年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2016-05-31
外文题名：	A Comparative Study of Commentaries on B&R in China, US and UK Think Tanks : A CDA Perspective
关键词：	批评话语分析；智库；评论；一带一路；三维分析理论
外文关键词：	Critical discourse analysis; Think tank; Commentaries; the Belt and Road Initiative; Fairclough’s Three-dimensional Model
论文摘要：	︿智库作为从事开发性研究的咨询研究机构，在国家发展中发挥着重要作用。同时，作为现代社会的重要社会组织，智库在社会系统中掌握着相当的话语能力和资源。智库评论是智库成果的重要组成部分和典型代表。“一带一路”战略是目前我国最高的国家级顶层战略，也是影响全球的重要战略。中美英都是智库建设大国，三国智库均对“一带一路”战略有相当程度的关注与研究。本文旨在研究智库评论语篇与话语实践特点，洞察智库中的中国镜像构建与其背后的意识形态，并尝试对我国智库对外传播提出建议。批评话语分析是语言学领域的一个新兴分支，它关注话语背后的权力和意识形态，经常用于跨学科领域的研究当中。Fairclough的三维分析理论、Halliday的系统功能语法都是批评话语分析的有效方法。在本文中，作者在批评话语分析的视角下，将Fairclough的三维分析理论作为主要分析框架，结合语料库方法，对中美英三国智库中的“一带一路”评论全文本进行定量与定性研究。通过对全文本在语篇、话语实践和社会实践层面进行分析，作者比较了三国智库评论在语篇、话语实践方面的异同以及对中国镜像构建的差异，发现这些差异与其背后的意识形态相互影响、相互构建。作者总结出智库评论文章具有较强的政治性、引导性、专业性、学术性、实践性和主观性的特点，并通过剖析三国在中国镜像构建中的特点，对中国智库对外传播提出建议。在翻译方面，作者提出了强调受众本位、融入宣传目的、关注翻译细节的翻译策略。本文共由五个部分组成。首先介绍论文的写作背景，阐明论文的研究对象、内容和三个主要问题及目的，说明论文的写作结构；接着进行文献综述，依次介绍话语分析和批评话语分析的相关概念和研究情况，并对智库进行介绍；接下来介绍理论框架，方法主要为Fairclough的三维分析理论、Halliday的系统功能语法理论和语料库的研究方法；核心部分为数据收集与分析部分，在中国的国务院发展研究中心、美国的布鲁金斯学会、英国的查塔姆学会分别抽取“一带一路”主题评论的全文本，在语篇、话语实践和社会实践三个层次进行分析。最后，总结智库评论特点及其意识形态，对中国智库评论的对外传播和翻译策略进行探讨，并对本研究的不足进行分析。﹀
外文摘要：	︿As consulting and research institutions that focus on national development, think tanks play a very important role in the development of a country. At the same time, they are important social organizations in modern society and command considerable discourse power and resources. Commentaries are an important part, and a typical representative, of think tanks’ output. The Belt and Road Initiative (B&R) is currently China’s highest-level national strategy and also an important strategy with global influence. China, the US and the UK are all leading nations in think tank building, and think tanks in all three have paid considerable attention to B&R. This paper aims to study the characteristics of the text and discourse practice of think tank commentaries, gain insight into the images of China constructed by think tanks and the ideologies behind them, and offer suggestions for the international communication of Chinese think tanks. Critical discourse analysis (CDA) is a new branch of linguistics concerned with the power and ideology behind discourse, and it is often used in interdisciplinary research. Fairclough’s Three-dimensional Model and Halliday’s systemic functional grammar are both effective methods of CDA. In this paper, taking a CDA perspective and using Fairclough’s Three-dimensional Model as the main analytical framework, the author combines it with the corpus approach to conduct quantitative and qualitative research on the full texts of B&R commentaries from think tanks in China, the US and the UK. By analyzing text, discourse practice and social practice, the author compares the commentaries of the three countries’ think tanks in terms of text, discourse practice and the construction of China’s image, finding that these differences and the ideologies behind them influence and shape each other. The author concludes that think tank commentaries are political, instructive, professional, academic, practical and subjective. In addition, by examining how each country’s think tanks construct China’s image, the author offers suggestions for the external communication of Chinese think tanks and proposes translation strategies: emphasizing an audience-based approach, incorporating the publicity purpose, and attending to the details of translation. The paper consists of five parts. The first introduces the background, the research object and content, the three main research questions and objectives, and the structure of the paper. The second is a literature review covering the concepts and research status of discourse analysis and critical discourse analysis, along with an introduction to think tanks. The third introduces the theoretical framework, mainly Fairclough’s Three-dimensional Model, Halliday’s systemic functional grammar and the corpus method. The core part is data collection and analysis: the author analyzes full-text B&R commentaries from the Development Research Center of the State Council of China, the Brookings Institution in the US and Chatham House in the UK at the levels of text, discourse practice and social practice. The final part summarizes the characteristics of think tank commentaries and their ideologies, discusses external communication and translation strategies for Chinese think tank commentaries, and analyzes the limitations of this study.﹀
分类号：	H087/TP391
论文总页数：	85
参考文献总数：	70
馆藏号：	017/M2016(1005)
公开日期：	2017-05-31

基于阅读动机的面向初中生读者的科普文本编译研究.马荣荣

题名：	基于阅读动机的面向初中生读者的科普文本编译研究
姓名：	马荣荣
学号：	1401210676
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2016-05-31
外文题名：	On Reading-Motivation Based Popular Science Text Trans-editing for the Junior School Students
关键词：	科普文本；文本编译；初中生；阅读动机；文本类型
论文摘要：	︿本文是基于科普文本《解释科学：现代科学的发现》的编译实践的一份研究报告。本书原文是面向普通大众的，而译文的读者受众变为了初中生。在读者受众发生变化的时候，如何编译文本来满足初中生读者的阅读需求，正是本文要探讨的内容。阅读动机是影响初中生阅读质量的重要因素，阅读动机不存在的话，那么阅读行为就无从谈起。基于前人对文本编译、阅读动机以及激发初中生读者阅读动机的文本的相关研究，本文确定了本书的编译策略，包含激发读者内部阅读动机的编译策略、激发读者外部阅读动机的编译策略、激发读者社会阅读动机的编译策略以及激发读者阅读效能动机的编译策略。根据《解释科学：现代科学的发现》一书的编译，本文归纳了各个编译策略下的具体的编译方法，包括情绪渲染法、添加趣味故事法、满足阅读期待法、营造设问情境法、贴近课堂学习法、补充背景信息法、增加启迪信息法、加注法、补充上下文衔接信息法、词义完整化法以及长句化简法。译文要尽可能贴近激发读者阅读动机的文本类型，使得初中生读者在阅读译文的过程中感受到阅读的乐趣。﹀
分类号：	H087/TP391
论文总页数：	199
参考文献总数：	60
馆藏号：	017/M2016(959)
公开日期：	2016-05-31

基于形成性评估的高中英语词汇评估方法设计及有效性研究.梁云辉

题名：	基于形成性评估的高中英语词汇评估方法设计及有效性研究
姓名：	梁云辉
学号：	1301210771
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	中国人民大学外国语学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2016-05-31
外文题名：	Design and Effectiveness Research of High School English Vocabulary Evaluation Methods Based on Formative Evaluation
关键词：	个体性差异；形成性评估；高中英语；二语词汇习得
论文摘要：	︿随着计算机辅助词汇学习平台的蓬勃发展，能根据学习者的词汇知识、学习能力和学习风格等相关信息因材施教的自适应学习平台已开始出现。为做到更好的因材施教，词汇学习平台需要能对学习者个体的词汇知识及学习风格等个性化信息进行快速评估的综合评估系统。形成性评估注重个体的学习过程，旨在发现学习过程中存在的问题，可以较好地契合因材施教的理念。但是，针对词汇形成性评估的研究尚存在诸多不足，如个体差异性分析较少、依赖于人为主观评价而难以保证反馈的有效性等，难以满足自适应学习平台对评估的准确化、即时化需求。因此，完善形成性评估方法及探索其在新领域中的应用成为国内外研究的新热点。本文结合第二语言词汇习得和形成性评估的研究成果，针对高中英语词汇教学和在线词汇学习系统的特点，提出并设计一种新型的英语词汇形成性评估方法。本文主要从学习者个体的四个方面进行评估：词汇知识、词汇记忆效果、词汇学习策略以及词汇学习策略与词汇记忆效果间的相关性。为更好地设计评估方法以及验证评估方法的有效性，本文对安徽省某重点高中的45名学生进行为期近3个月的高中英语词汇在线学习评估实验。学生被随机平均分配到标准组、材料组及学习组。本文首先通过分析标准组4个课时的词汇在线学习过程并结合已报道的研究，对猜词策略、例句阅读策略及反思型订正策略的评估标准和练习评分细则进行针对性制定。其次，依据材料组对目标词汇及例句的标注情况，确定学习组的学习材料。再次，利用原型设计软件对满足本文评估需求的词汇在线学习平台进行设计。最后，基于所设计的在线学习平台对学习组进行15个课时的评估实验。实验表明，本文提出的基于形成性评估的词汇评估方法能够获悉学习者个体对所学每个词汇的掌握程度、词汇学习效果、词汇保持效果及词汇学习策略倾向。本文进一步采用SPSS软件分别分析词汇学习策略与词汇学习效果及词汇保持效果间的相关性，结果显示反思型订正词汇学习策略与词汇学习效果及词汇保持效果间的相关性均不显著，猜词策略与词汇学习效果和词汇保持效果间呈现显著正相关关系，例句阅读策略与词汇学习效果和词汇保持效果间呈现显著正相关关系。最后，本文通过问卷调查和因材施教实验验证所提出的评估方法的合理性。本文提出的新型评估方法能够为自适应学习平台中基于计算机及关注学习者个体性差异的形成性评估系统的有效建立提供指导。﹀
分类号：	H313/G434
论文总页数：	76
参考文献总数：	0
馆藏号：	017/M2016(996)
公开日期：	2016-05-31

教育术语的翻译策略探究—以《“伟大”美国学校体系的死与生》一书的翻译为例.党凡钰

题名：	教育术语的翻译策略探究—以《“伟大”美国学校体系的死与生》一书的翻译为例
姓名：	党凡钰
学号：	1201210549
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	软件与微电子学院
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2016-05-31
关键词：	教育术语；术语翻译；翻译策略
论文摘要：	︿本次翻译实践所选书籍名为《“伟大”美国学校体系的死与生》(The Death and Life of the Great American School System)，其作者是戴安娜·拉维奇（Diane Ravitch），由基础书籍出版社于2010年出版发行。作者剖析了美国多个州和城市推行的教改，痛斥了那些盲目引进商业领域绩效考核、流程再造理论的改革，可供我国学者反思。本次翻译实践节选了本书的前六章，书中出现了很多教育术语，由于有些术语的汉译尚未规范，大大增加了翻译的难度。本文根据《“伟大”美国学校体系的死与生》一书的翻译实践，总结出教育术语的翻译原则，即在遵循一般术语翻译的准确性、简明性和规范性原则的前提下，还应该秉承重视地域差异、关注感情色彩、紧密结合语境的原则。并在此三大原则的基础上，给出了相应的翻译策略：遵循约定俗成、适度补偿信息、从上下文语境出发、对比与选择以及合理创造。在探讨中，结合具体实例对这些策略进行了分析。希望笔者的这些探究可以对其他此类书籍的翻译有所助益。由于所译书籍的学术性质，本文在对译例的探讨过程中加强了背景知识的介绍，旨在促进中西方在教育领域的学术交流。﹀
分类号：	H
论文总页数：	197
参考文献总数：	0
馆藏号：	017/M2016(999)
公开日期：	2016-05-31

美国大选电视辩论中的字幕翻译策略研究.邵巾芮

题名：	美国大选电视辩论中的字幕翻译策略研究
姓名：	邵巾芮
学号：	1301210892
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
论文答辩日期：	2016-05-31
外文题名：	A Study of Subtitle Translation Strategies-A Case Study of TV Debates of the U.S. Presidential Election
关键词：	电视辩论；美国大选；特朗普；字幕翻译
外文关键词：	TV debates; presidential election; translation; subtitle
论文摘要：	︿无论如何看待特朗普或是其他候选人，2016年的美国大选早已集合了民众前所未有的热情，全程瞩目，而在本次大选电视辩论中，也出现了太多瞬间巨变的时刻。本文旨在探讨美国大选电视辩论中的字幕翻译策略。随着时代的发展，字幕翻译虽然越来越为人们所关注，但相关的学术研究仍然较少。本文结合2016美国大选辩论这样一个热点事件和诸多总统候选人的话语来研究字幕翻译，希望能够呈现出一些与以往不同的研究结果。本文无论是从大选文本的独特性分析，还是从字幕翻译方面，都提出了一些有价值的观点。首先，在翻译案例分析部分，提取了不同译者对同一文本的译文，进行了字幕的多译本比较。其次，引入以“弗莱石金凯可读性公式”为代表的众多评价方式，对演讲人的语言风格进行了数据分析，证明其不同的特点，具有较高的客观性。本文主要致力于探讨几个方面的问题：一是大众对于电视辩论语言翻译的误区有哪些？二是中国读者在观看美国大选辩论时的需求是什么？三是电视辩论文本在翻译时和其他类型字幕的差异性在哪？四是对电视辩论中特殊语言现象有哪些针对性的翻译策略？本文主要采用描述性研究法，为确保比较的合理性，笔者把对比译文局限于同一个语境，即2016年美国大选电视辩论以及与大选相关的脱口秀，时事评论节目。首先提出了大选辩论字幕翻译的特殊性，即不用过分受时间和空间的限制，且同一场辩论文本中的语言风格会由于竞选人的个人特点而发生较大转换，因此，译者应使用多种方法进行翻译，不拘泥于任何一种翻译理论。其次，具体到翻译过程，大众对于电视辩论语言的翻译存在较多误区，大选辩论语言虽具有一定的严肃性，但本质上并未脱离口头语的形式，具有较高文化程度的议员在发言时并未用到太多难词和政治术语，且整体所使用的句子长度也仅处于初中水平。最后，根据译文情况，笔者提出了电视辩论字幕翻译的几个标准，其中最突出的是要在异化的基础上多元尝试，将人物性格化，即翻译要整体把握人物性格，把握其语言特色，从而在译文中努力再创造这些人物性格，最终实现不仅“形”似，而且“神”似。﹀
外文摘要：	︿Whatever you think of Donald Trump or any of the other candidates, the 2016 American presidential election has attracted unprecedented attention. This thesis discusses subtitle translation strategies for American TV presidential debates. Subtitle translation is of great importance nowadays, but research in this field is far from satisfactory. Using transcripts of the 2016 American presidential debates, this thesis aims to offer new perspectives on subtitle translation strategies. It first points out the uniqueness of the debate text and of its subtitle translation. For data analysis, it uses measures such as the Flesch-Kincaid grade formula to show that the debate text contains different language proficiencies and styles. For case analysis, it compares different translations of the same source text. The thesis focuses on the following questions: first, what are the common misconceptions about translating TV presidential debates? Second, what do Chinese viewers primarily need? Third, how does the translation of presidential debate subtitles differ from that of other types of subtitles? Fourth, what are the main principles and strategies for translating TV presidential debates? The conclusions are as follows. First, contrary to the popular assumption that political debates contain numerous big words and long sentences, readability tests show that debate language is close to a junior high school reading level. Second, compared with other types of subtitles, TV debate subtitling is less restricted by time and space. Finally, TV debate subtitle translation should rely less on domestication and attach more importance to reflecting the personality and style of each debater.﹀
分类号：	H087/TP391
论文总页数：	54
参考文献总数：	0
馆藏号：	017/M2016(1006)
公开日期：	2016-05-31

复杂并列结构的汉译策略研究——以《优秀的绵羊》为例.陈子怡

题名：	复杂并列结构的汉译策略研究——以《优秀的绵羊》为例
姓名：	陈子怡
学号：	1301210583
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	柏晓静
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2016-05-31
关键词：	复杂并列结构；汉译策略
论文摘要：	︿本文基于《优秀的绵羊》（Excellent Sheep）一书的翻译实践，该书由曾经任教于耶鲁大学的教授威廉·德雷谢维奇创作完成，于2014年由自由出版社出版，描写了美国常春藤盟校的现状，以及对美国高等教育体制的批评和反思。书中涉及了大量的并列结构，包含并列词语、并列短语、并列子句和并列句子等等，其中的一些并列结构十分复杂，表达了丰富的含义和强烈的感情色彩。并列结构是由两个或者两个以上的词、短语或句子排列组合而构成的一种句法结构，其中，这些组成部分拥有相同或相似的形式，它们词性相同、功能相似、含义相关。国外研究者对于并列结构的标志和构成已经研究得十分深入，但对于更为复杂的并列结构则没有提出一套较为完整的体系。国内关于并列结构汉译的研究，主要涉及基本的翻译策略和语序的调整原则，这适用于并列项较为单一的结构，对于复杂的并列结构则不具备很强的指导性。笔者在现有研究的基础上，根据具体的翻译实践，提出了复杂并列结构的概念，并划分出三种分类方式，即嵌套式并列结构、非对称式并列结构、附加成分式并列结构。同时，在传统二分支结构图的基础上，进一步发展了适用于复杂并列结构的二分支结构图。并进一步结合书中实例，总结和提炼出针对复杂并列结构的翻译策略——针对嵌套式并列结构，保留嵌套和打破层级的方法；针对非对称式并列结构，统一地位和保持差异的方法；针对附加成分式并列结构，短句前置和长句后移的方法。此外，笔者还延伸探讨了并列结构译为非并列结构的情况，翻译过程中标点符号的处理原则，以及附加成分式并列结构在机器翻译中的语序问题。简言之，笔者试图探讨复杂并列结构汉译策略，以对并列结构的翻译研究有所补充。﹀
分类号：	H087/TP391
论文总页数：	201
参考文献总数：	0
馆藏号：	017/M2016(1032)
公开日期：	2016-05-31

2016-05-30

本地化项目管理中翻译流程优化的研究.杨恒杰

题名：	本地化项目管理中翻译流程优化的研究
姓名：	杨恒杰
学号：	1201210926
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-30
外文题名：	Research on translation workflow optimization in localization project management
关键词：	本地化项目管理；翻译流程优化；沟通管理；机器翻译
论文摘要：	︿翻译是一个非常古老的行业，而信息时代的全球化催生了一个新兴的行业 — 本地化行业。该行业因微软和IBM等各IT巨头的软件本地化的需求而兴起于上个世纪七十年代末。随着信息技术日新月异，伴生的本地化获得蓬勃发展，它已然深入影响了人们生活和工作的方方面面，更显著改变了语言服务产业的面貌。本地化行业隶属于服务业。对本地化项目的翻译流程进行科学有效地管理和优化，不仅有助于项目取得成功、规避失败，对于在满足客户需要的前提下降低成本、提高企业效益，也具有尤为重要的意义。因此，本文将本地化项目管理中翻译流程的优化作为研究对象。本文从本地化项目实践中的需求出发，在学术界和产业界人士对本地化翻译所涉及各项因素的最新研究成果的基础上，结合笔者在实习和工作期间负责的本地化项目经历，对本地化项目的翻译流程进行了分析和研究。笔者根据当前环境的需求变化并借助相关技术的最新发展，设计并提出了一个针对本地化项目管理中翻译流程优化的解决方案。其主要特点在于将机器翻译正式纳入本地化的标准工作流程，整合翻译流程各环节并对每个环节增加回馈以实现高效双向沟通，从而提高了完成本地化项目的效率并保证了项目质量。最后，本文结合行业中的实际项目案例对该解决方案的应用效果进行了验证，证明该优化解决方案在本地化项目中的实施能够切实提高本地化项目的利润率，帮助本地化公司在竞争激烈的环境下提高客户满意度和经济收益。本论文的研究成果对于本地化行业的各公司在实务中降低成本、提高经济效益具有一定的参考意义和现实意义。﹀
分类号：	TP311.52/F224.5
论文总页数：	66
参考文献总数：	0
馆藏号：	017/M2016(1039)
公开日期：	2016-05-30

基于技术接受模型的计算机辅助翻译软件用户接受行为和培训研究.荆斌

题名：	基于技术接受模型的计算机辅助翻译软件用户接受行为和培训研究
姓名：	荆斌
学号：	1201210630
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2016-05-30
外文题名：	STUDY ON USER ACCEPTANCE BEHAVIOR AND TRAINING OF COMPUTER AIDED TRANSLATION SOFTWARE BASED ON TECHNOLOGY ACCEPTANCE MODEL
关键词：	计算机辅助翻译；技术接受模型；接受行为
外文关键词：	Computer aided translation; Technology acceptance model; Acceptance Behavior
论文摘要：	︿随着当前人们对翻译质量和效率的需求越来越高，计算机辅助翻译软件作为提高翻译生产力的工具，自然越来越受到译者的关注，这同时也带来了计算机辅助翻译软件培训的需求。要向用户培训计算机辅助翻译软件的操作，首先面临的问题就是此类软件能否被用户接受，哪些因素会使用户更愿意去使用此类软件。因此本文试图回答以下几个问题：哪些因素影响用户对计算机辅助翻译软件的接受和使用，从促进用户接受的角度进行计算机辅助翻译软件的培训，哪些是最佳培训策略。本文的研究首先通过文献阅读和对软件特点分析找到影响译者接受计算机辅助翻译软件的可能因素，并将这些因素分类为个人因素、组织因素和社会因素，同时将感知风险也纳入研究范围，结合技术接受模型创建计算机辅助翻译软件用户接受模型。然后通过调查问卷收集了 213 名使用此类软件译者在这些因素和软件使用方面的数据，通过相关性分析找到因素之间的联系，通过结构方程修正建立的模型最终确定影响因素。结果表明个人因素、社会因素和感知风险会影响译者接受和使用此类软件，但组织因素对其没有影响。根据这些已经确定的影响因素，在个人因素、社会因素、感知风险、有用性和易用性方面制定促进用户接受的培训策略，并通过两组培训进行对照试验以证明策略的有效性。试验结果显示这些策略可以提高用户的软件信念，促进用户使用软件。以上结论证明这些培训策略可以使用户更好的接受和使用计算机辅助翻译软件。本文为计算机辅助翻译软件用户接受行为的研究提供了依据，希望可以对未来计算机辅助翻译教学和软件培训提供方法上的启发。﹀
外文摘要：	︿With the increasing demand for translation quality and efficiency, translators are paying more and more attention to computer-aided translation software as a tool for improving productivity, which has in turn created a need for training in such software. The first problems that arise when training users are whether they will accept this kind of software and which factors make them more willing to use it. This paper therefore tries to answer the following questions: which factors affect users’ acceptance and use of computer-aided translation software, and which training strategies are best from the perspective of promoting user acceptance? The study first identifies factors that might influence translators’ acceptance of computer-aided translation software through a literature review and an analysis of software characteristics. These factors are classified as personal, organizational and social factors; perceived risk is also included, and all are combined with the technology acceptance model to build a user acceptance model for computer-aided translation software. Data are then collected through a questionnaire completed by 213 translators who use such software. Correlation analysis is used to find the connections between the factors, and structural equation modeling is used to revise the model and determine the final influencing factors. The results show that personal factors, social factors and perceived risk affect translators’ acceptance and use of such software, whereas organizational factors have no effect. Based on these factors, training strategies that promote user acceptance are developed in terms of personal factors, social factors, perceived risk, perceived usefulness and perceived ease of use, and a controlled experiment with two training groups is carried out to test their effectiveness. The results indicate that these strategies can strengthen users’ beliefs about the software and promote its use, which demonstrates that the training strategies help users better accept and use computer-aided translation software. This paper provides a basis for studying the acceptance behavior of computer-aided translation software users, and it is hoped that it can inform future computer-aided translation teaching and software training.﹀
分类号：	TP391.2
论文总页数：	69
参考文献总数：	50
馆藏号：	017/M2016(954)
公开日期：	2016-05-30

以过程监控为核心的翻译查证能力教学研究——以CATTP平台为例.李雅慧

题名：	以过程监控为核心的翻译查证能力教学研究——以CATTP平台为例
姓名：	李雅慧
学号：	1301210750
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-30
关键词：	翻译过程研究；翻译教学；翻译查证能力
论文摘要：	︿随着我国翻译硕士教育的发展和对翻译人才培养的重视，已经有不少高校开设计算机辅助翻译课程。翻译查证是计算机辅助翻译课程体系中的重要内容。目前翻译查证教学主要依靠课上讲解和演示，实践环节较为缺乏。而在实践环节中对学生的翻译查证作业提出精细要求，进行具体指导，并对错误进行准确分析，将会打开翻译查证教学的新天地。为改变翻译查证教学的现状，本文提出以过程监控为核心的翻译查证能力教学方法。过程监控体现在两方面，一是对教学过程进行监控，也就是将过程教学法引入翻译查证教学；二是本文的主要研究方法，即通过屏幕录像观察学生的翻译过程和查证过程，获得相关数据。以过程监控为核心的翻译查证教学分为以下几个步骤：译前背景资料查找、译中参考查证方法标注进行翻译，并填写固定模板的翻译查证笔记，译后进行小组讨论，最后是教师指导。翻译查证标注体系是本文的创新点，该体系由三部分组成，分别是：查证方法、查证错误和翻译错误。为了验证教学方法和标注体系的有效性，本研究在北京某语言类高校30名MTI一年级学生中进行了教学实验。实验组采用以过程监控为核心的翻译查证教学法进行翻译，对照组则是按照传统的翻译查证教学法进行，即没有查证方法标注也没有过程互动。两组的翻译过程均录屏。实验结束后对翻译过程视频进行分析，并通过问卷和访谈的形式继续了解学生的翻译过程和对教学方法的主观感受。实验结果发现，查证方法标注能有效提高学生对翻译查证方法的掌握，提高翻译质量和查证意识。以过程监控为核心的翻译查证教学法的教学效果更好，也更受到学生的欢迎。﹀
分类号：	H087/TP391
论文总页数：	88
参考文献总数：	0
馆藏号：	017/M2016(978)
公开日期：	2016-05-30

认知负荷理论指导下教材翻译的翻译策略.杨冰莹

题名：	认知负荷理论指导下教材翻译的翻译策略
姓名：	杨冰莹
学号：	1201210922
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-30
外文题名：	TRANSLATION STRATEGIES OF TEACHING MATERIALS UNDER THE GUIDANCE OF COGNITIVE LOAD THEORY
关键词：	教材翻译；认知负荷理论；翻译策略
外文关键词：	Teaching material translation; Cognitive Load Theory; Translation strategy
论文摘要：	︿笔者翻译的《技术职业的有效沟通》（Effective Communication for the Technical Professions）属教材类书籍。笔者在探索教材翻译策略的过程中发现：教材翻译的现有研究普遍存在着与教材体裁的特殊性结合不够紧密、对教材翻译目的考虑不够全面等问题。因此，对于教材这种特定体裁的翻译，有必要找出既有针对性又不失一定普适性且较为系统的翻译策略。针对教材翻译目的的特殊性，笔者发现教材体裁与认知负荷（cognitive load）密切相关。因此，笔者结合认知负荷理论（Cognitive Load Theory），从译文认知负荷这一新视角讨论教材翻译策略。本研究先从研究对象、翻译对认知负荷的影响以及教材的特殊性三方面对认知负荷理论指导教材翻译的可行性进行了深入分析。论证了可行性后，笔者结合翻译教材的目的与本项目的特点，确立了认知负荷理论指导教材翻译的指导思想，并提出了以下翻译策略：1）减少“外在”认知负荷，即因教材翻译不当产生的额外的认知负担。减少外在认知负荷的翻译策略有：减少句子的元素，降低句子的元素互动性；明确代词的指代内容，同义词简化为同个词；通过融合的方式进行文化背景补充。2）促进“有效”认知负荷的提高。我们将读者有效学习时所产生的认知负荷称之为“有效”认知负荷。读者投入学习的精力越多，产生的有效认知负荷就越高。在翻译中，促进有效认知负荷的提高则可以通过采用非强迫语言风格提高学生的学习动机，保留提示重难点的文章标记，保留自由目标练习的有效性以辅助读者巩固知识等策略来实现。在讨论翻译策略时，笔者列举了具体翻译范例进行分析论证，以说明策略的有效性。此外，因减少外在认知负荷的翻译策略具有通过实验加以验证的可行性，笔者设计并展开了对照实验。实验结果进一步说明了本文所提出的减少外在认知负荷的翻译策略的有效性。本研究为教材翻译研究开拓了新视角，从认知负荷的角度讨论教材翻译，并提出了可行的翻译策略；还通过翻译实践和对照实验的方式，验证了翻译策略的有效性。希望本研究能为指导翻译实践提供一些帮助，并引起广大学者对认知负荷理论与教材翻译研究的结合进行更深入的思考。﹀
外文摘要：	︿The book translated in this project, Effective Communication for the Technical Professions, is a textbook. While exploring translation strategies for textbooks, the author identified some common problems in existing studies of teaching material translation: some were not closely tied to the particular features of teaching materials as a genre, and some did not consider the purpose of teaching material translation thoroughly. It is therefore necessary to find translation strategies for this genre that are targeted, reasonably general, and systematic. Considering the purpose of teaching material translation, the author found that teaching materials and cognitive load are closely related. This paper therefore discusses teaching material translation from the perspective of the cognitive load imposed by the translated text, under the guidance of Cognitive Load Theory (CLT). First, the feasibility of using CLT to guide teaching material translation is analyzed in terms of the research objects, the impact of translation on cognitive load, and the characteristics of teaching materials. Then, combining the purpose of teaching material translation with the specific features of this project, a guiding principle is established and the following translation strategies are proposed. 1) Reduce the “extraneous” cognitive load, which is the extra cognitive burden caused by inappropriate translation of teaching materials. This can be done by reducing the number of elements in a sentence and lowering its element interactivity, clarifying what pronouns refer to and replacing synonyms with a single word, and merging explanations of cultural background into the main text. 2) Enhance the “germane” cognitive load, which is produced when the reader studies effectively. The more effort readers devote to studying, the higher their germane cognitive load. This can be achieved by using a non-controlling language style to raise students’ motivation, keeping the textual signals that mark important or difficult points, and preserving the effectiveness of goal-free exercises to help readers consolidate what they have learned. Specific translation examples are analyzed to illustrate the effectiveness of the strategies. In addition, because the strategies for reducing extraneous cognitive load can be tested experimentally, a controlled experiment was designed and carried out, and its results further support the effectiveness of those strategies. This study opens up a new perspective for teaching material translation research: it discusses teaching material translation in terms of cognitive load, proposes feasible translation strategies, and verifies them through both translation practice and a controlled experiment. Hopefully, it will assist translators in practice and prompt scholars to think more deeply about combining Cognitive Load Theory with teaching material translation research.﹀
分类号：	H087/TP391
论文总页数：	216
参考文献总数：	0
馆藏号：	017/M2016(1011)
公开日期：	2016-05-30

语块在翻译教学中的应用——基于眼动追踪的实验研究.李静雅


题名：	语块在翻译教学中的应用——基于眼动追踪的实验研究
姓名：	李静雅
学号：	1401210603
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-30
外文题名：	The Application of Lexical Chunks in Translation Teaching: An Experimental Study Based on Eye-Tracking
关键词：	语块；翻译教学；翻译认知；教学方法；眼动追踪
外文关键词：	Chunk; Translation Teaching; Translation Cognition; Teaching Method; Eye-Tracking
论文摘要：	︿近年来，语块理论成为了翻译教学中的热点问题，国内外许多研究表明，在翻译教学中应用语块理论对学生翻译能力的提高有显著的效果。本文基于翻译认知理论，将翻译认知理论研究的线上与线下方法相结合，利用眼动追踪技术，研究了翻译教学中不同的语块教学方法和内容对学生的学习效果和认知加工过程的影响，以探索最佳的语块教学方法和内容。本论文总结了目前国内外翻译教学现状、语块理论研究现状、以及翻译认知研究现状，对已有眼动实验研究进行了细致地总结和分析。在研究问题上，本文主要围绕语块的“记忆方法”、“教学顺序”、“教学材料”三大内容展开。记忆方法，包括传统的背诵方法和学习例句的语境方法，假设通过学习例句的语境方法可以更好地提高语块学习效果，在翻译过程中有效降低认知负担；教学顺序，即语境学习法中双语例句的出现顺序，包括一般的同时出现的方式，和先出现英语例句再出现中文例句的方式，探究何种顺序可以更好地促进语块学习效果，降低翻译过程中的认知负担；教学材料，指语境学习法中的例句数量，实验研究了1个例句、2个例句和3个例句中哪一种对提高语块学习效果和降低翻译认知负担的帮助最大。此外，在研究语块教学方法的同时，本文还额外探究了在翻译过程中，对包含多种译法的语块进行提示是否可以有效提高翻译效率的问题，设计了不提示、提示比喻义、提示比喻义和字面义的三种提示方法，为机器翻译中对语块的提示功能作出启示。围绕研究问题，本文共设计并完成了四个基于翻译认知理论的眼动实验。通过分析实验中被试的翻译用时、翻译速度、翻译正确率、停顿次数等非眼动数据，以及被试在翻译每个单句时的注视次数、注视时间，在每个语块兴趣区中的注视次数、注视时间等眼动数据，最终得出了以下结论：语块的教学方法中，使用语境教学法可以更有效降低翻译过程中的认知负担；在语境教学法中，先学习英语例句，再学习中文例句，可以更有效降低翻译过程中的认知负担；对于例句数量，应依据学生的实际需求进行调整，一般提供3个例句可以更好地降低翻译过程中的认知负担和翻译语块时的认知负担。本文从全新的角度对翻译教学、语块教学和翻译认知研究展开讨论，对提高高校学生的翻译水平有重要的启示作用。﹀
外文摘要：	︿ In recent years, chunk theory has become a hot research topic in translation teaching. Many studies have shown that using chunk theory in translation teaching significantly improves the translation ability of university students. This paper studies the effects of different chunk teaching methods and materials on students’ cognitive processes in translation, combining the offline and online methods of translation cognition research with eye-tracking technology. The goal of this study is to find the best chunk teaching method for university students. This paper first analyses the current situation of translation teaching, previous studies on chunk theory and translation cognition, and the existing eye-tracking experiments at home and abroad. Based on these studies, this paper designs three main experiments on chunk teaching, centering on memorization methods, teaching order and teaching materials. Memorization methods include the traditional recitation method and the context method. The hypothesis is that the context method causes lower cognitive load in the translation process. Teaching order covers presenting bilingual example sentences at the same time versus presenting the English example sentence first and then the Chinese one. The hypothesis is that the latter causes lower cognitive load in the translation process. Teaching material concerns the number of example sentences (1, 2 or 3); the aim is to find the number that yields the lowest cognitive load in the translation process. In addition, this paper designs an extra experiment to study chunks with multiple possible translations in machine translation. After analyzing basic data such as translation time, speed, accuracy rate and number of pauses, and eye movement data such as fixation count and fixation duration collected from the experiments, this thesis draws the following conclusions: for chunk teaching methods, the context method is more effective in reducing cognitive load, and presenting the English example sentence first and then the Chinese one is more effective in reducing cognitive load in the translation process. The number of example sentences depends on the needs of the students; normally 3 example sentences is the best choice. To sum up, these key conclusions offer a new perspective on translation teaching and contribute to enhancing the translation ability of university students. ﹀
分类号：	H087/TP391
论文总页数：	88
参考文献总数：	0
馆藏号：	017/M2016(1014)
公开日期：	2016-05-30

英汉翻译中习语的处理策略研究.张亚琦


题名：	英汉翻译中习语的处理策略研究
姓名：	张亚琦
学号：	1201211013
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
论文答辩日期：	2016-05-30
关键词：	习语；标记理论；翻译
论文摘要：	︿ Conform: Exposing the Truth about Common Core and Public Education（《困兽之斗》）是一本论说型书籍，剖析美国当今教育体系，揭示了许多被公众看作是稀松平常的事物背后的政治意图和阴谋。叙述和评论兼备，语言生动活泼、幽默老辣、引人入胜。其中，习语的使用是本书的一大特色，让人在思辨的同时，又获得语言的享受。习语的翻译不是一个新鲜话题，但是在数量众多的文献中，大同小异者居多，而且多数论述简单，论证不足。归化异化、功能对等、文化视角、意象视角的讨论占据绝大多数，还从未有学者从标记理论的视角研究习语的翻译。笔者通过对标记理论和习语的研究发现，二者就有天然的关联性：习语是带有人为标记的语言形式。这些标记不是随意存在的，而是出于特定的目的。人们使用习语也正是因为看中了习语的标记效果。侯国金提出的语用标记等效翻译原则（PMEP）是标记理论和功能对等理论的二级原则，为习语翻译指出了清晰的翻译目标。PMEP对翻译有较强的指导意义，但是对多标记多语效的复杂标记现象的讨论欠缺，本文对其进行了补充。在翻译原则方面，结合《困兽之斗》一书的文体特征，笔者提出了“简洁自然置于首位，标记等效尽量实现”的翻译原则。之后总结了习语的翻译难点，并给出解决思路。由于不同类型的习语在翻译中遇到的困难不尽相同，笔者对习语进行了分类讨论，并给出标记全部平移、标记平移+标记添加、标记平移+标记舍弃、标记平移+标记舍弃+标记替换四类翻译方法，其中属于标记平移+添加的仿拟策略、属于标记全部平移的歇后语创造策略和属于标记替换的活用策略是亮点。﹀
分类号：	H315.9
论文总页数：	193
参考文献总数：	0
馆藏号：	017/M2016(1020)
公开日期：	2016-05-30

以游戏机制为核心的教育游戏在英语语法教学中的设计与实现.李想


题名：	以游戏机制为核心的教育游戏在英语语法教学中的设计与实现
姓名：	李想
学号：	1201210669
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-30
关键词：	教育游戏；英语语法；游戏设计
论文摘要：	︿随着“游戏的一代”的成长，游戏在教育领域中的应用受到越来越多的关注。游戏是人类的天性，是在幼童成长之初所接受的最早的教育形式，是孩子了解和认知世界的过程中不可或缺的一部分。在我国的教育过程中，固守传统、片面的应试教育一直占主导地位，游戏在教学中的重要作用逐渐被忽略。现今教育游戏的发展及应用还原了教育的愉悦性，更加符合青少年发展时期活泼、好动的天性。随着这一领域相关研究和应用不断增多，人们逐渐了解到教育游戏在外语教学中的作用同样不可小觑。但是，英语教育游戏如何设计才能具有更好的教学效果？桌面游戏对第二语言的学习有什么促进作用？现有的英语教育游戏还存在哪些缺点与不足？又如何进行改进呢？应如何结合教学目标设计一款桌面教育游戏呢？又应如何对使用教育游戏的学习效率进行验证和分析呢？这些问题都值得深入探讨和研究。本文基于初中英语语法教学内容设计了一款桌面教育游戏，以行为主义学习理论、建构主义学习理论以及动机理论为指导，提出了以游戏机制为核心的教育游戏设计理念。本文首先对与教育游戏有关的概念进行了阐述和梳理，介绍了本文所设计的英语语法桌面教育游戏的主要理论基础，对第二语言语法教学进行了回顾。之后，本文从游戏设计的概念形成阶段到纸面原型阶段介绍了以游戏测试和迭代为核心思想的游戏设计理论，并基于外语教学与研究出版社出版的《英语八年级下册》（新标准）中的不规则动词变化形式和六种时态语法的相关知识点，使用多次迭代和反复测试的方法设计出了一套以游戏机制为核心的英语语法桌面教育游戏。以游戏机制为核心，是指将教学内容设计为游戏规则的一部分，使得游戏更具可玩性。之后本文从静态平衡和动态平衡两个方面对这套英语语法桌面教育游戏的平衡性进行了计算。笔者通过两次教学实验，论证本文设计的以游戏机制为核心的英语语法桌面教育游戏的可行性。第一次实验论证本文设计的英语语法桌面教育游戏的可玩度和学生对游戏的接受度，实验历时两周，以20名学生为研究对象，证明了游戏的玩法较易被学生接受，学生普遍认同游戏的乐趣性。第二次实验论证了以游戏机制为核心的桌面教育游戏的有效性，实验历时三周，以45名学生为研究对象。实验组采用本文设计的桌面教育游戏进行学习，对照组1不采用任何教学手段，对照组2采用传统方式对英语语法知识进行记诵。通过前测、后测以及问卷调查和访谈的形式收集数据，从定量和定性的角度分析使用本论文设计的英语语法桌面教育游戏的有效性。经过SPSS数据分析，本文得出以下三个实验结论：1.本文设计的以游戏机制为核心的桌面教育游戏对提高学生的英语语法水平很有成效。2.以游戏机制为核心的桌面教育游戏令学习者感到游戏的乐趣并受到学习者的喜爱。 3.以游戏机制为核心的桌面教育游戏提高了学习者对英语语法学习的兴趣。﹀
分类号：	H3/G8
论文总页数：	106
参考文献总数：	0
馆藏号：	017/M2016(1036)
公开日期：	2016-05-30

粉丝文化对翻译的影响—— 以“欧美圈”为例.尹玉珺


题名：	粉丝文化对翻译的影响—— 以“欧美圈”为例
姓名：	尹玉珺
学号：	1201210952
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	姚亚芝
导师1单位：	软件与微电子学院
导师2姓名：	张宏岩
导师2单位：	软件与微电子学院
论文答辩日期：	2016-05-30
外文题名：	The Influences of Fan Culture on Translation: A Case Study of “Western” Fandom in China
关键词：	粉丝；粉都；英汉翻译；欧美圈
外文关键词：	Fans; Fandom; Translation Study; Western Fandom in China
论文摘要：	︿借助文化产业发展、宽松的文化环境、便利的互联网等因素，过去的十年里，粉丝文化在国内迅速发展，越来越多的国外作品通过翻译介绍到了中国，吸引了大量粉丝，粉丝也对相关作品的翻译提出不同的需求。在本论文中，笔者将以具有代表性的“欧美圈”粉都为案例，论述粉丝群体和粉丝文化对相关作品的英汉翻译产生的影响。笔者在德赛都-费斯克-詹金斯的粉丝文化研究理论的基础上，结合以“欧美圈”粉都喜爱的美国电影字幕和漫画为主的相关作品翻译及受众批评分析，和对该粉都的粉丝抽样调查结果，开展文献研究和调查研究，探索我国粉丝群体和翻译作品之间的关系。本论文通过文献研究和案例研究，提出粉丝型受众通过“过度”消费和生产翻译两种行为同时直接和间接地影响翻译作品。笔者选取了“欧美圈”粉丝对过去10年中上映的美国电影字幕翻译的批评意见并进行详细分析，发现粉丝对翻译作品的评价标准是本真性，对翻译作品的期待主要表现在专有信息、互文信息、原文本风格、一致性等方面。笔者在对1355份有效抽样调查问卷分析过程中，发现粉丝交谈是最重要的影响途径，粉丝的口碑会干预受众对翻译作品的选择和购买，导致受众对粉丝译者和非粉丝译者产生偏见。﹀
分类号：	H315.9
论文总页数：	79
参考文献总数：	49
馆藏号：	017/M2016(982)
公开日期：	2016-05-30

交互式多媒体技术教程可用性研究.赵寻


题名：	交互式多媒体技术教程可用性研究
姓名：	赵寻
学号：	1301211117
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	对外经济贸易大学
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2016-05-30
外文题名：	The Usability of Interactive Multimedia Technical Instructions
关键词：	可用性；交互；技术教程；技术传播
论文摘要：	︿随着科技的发展，媒介的演变呈整合化发展趋势，技术教程的交付类型也变得多媒体化。为获得更好的用户体验，很多研究人员提倡引入交互手段对多媒体技术教程中的媒介和信息进行控制和选择，形成交互式多媒体技术教程。交互的运用以及不同交互程度对技术教程的可用性影响现在还没有一个统一的答案，所以有必要对交互式多媒体技术教程可用性进行探索和研究。本文探究了非交互式多媒体技术教程、低交互性多媒体技术教程、高交互性多媒体技术教程的可用性。新手用户分别在三组多媒体技术教程的指导下完成可用性测试任务，并在任务完成之后接受短期保持性测试、准确性测试、迁移测试和问卷调查。在实验数据的基础上，通过任务完成率、出错率、效率、短期保持性、准确性、迁移性、满意度等方面对三种多媒体教程的可用性水平进行评估。研究表明，总体而言高交互性多媒体技术教程的可用性最高、低交互性多媒体技术教程的可用性其次、非交互式多媒体技术教程的可用性最低。交互的引入能够提高用户完成任务的有效性、效率和用户的满意度，交互程度更高的交互式多媒体技术教程，其用户任务的有效性、效率以及用户满意度也相对更高。但是技术信息的准确性和迁移性并不与交互的程度直接相关，而是与特定的交互维度有关。当教程中包含了鼓励用户进行认知的主动加工的交互维度时，技术信息的准确性和迁移性相对较高。而就算教程的交互程度相对较高，但并未包含特定交互维度，则在信息准确性和迁移性方面与非交互式多媒体技术教程的差异也并不显著。本研究有利于技术传播工作者更好地了解交互式多媒体技术教程的优势和不足，为研究者进一步探究交互在技术教程中的应用奠定了基础。﹀
分类号：	H087/TP391
论文总页数：	82
参考文献总数：	0
馆藏号：	017/M2016(814)
公开日期：	2016-05-30

英国文化词汇的汉译策略——以《不畏艰险》为例.石晨


题名：	英国文化词汇的汉译策略——以《不畏艰险》为例
姓名：	石晨
学号：	1301210898
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2016-05-30
外文题名：	Translation Strategies for British Cultural Words: A Case Study of Through Thick and Thin
关键词：	英国文化词汇；汉译策略
外文关键词：	British cultural words; Translation strategies
论文摘要：	︿本翻译项目所译书籍题为Through Thick and Thin，该书是知名英国华裔时尚顾问古克·温（Gok Wan）的自传，于2010年出版。作者回忆了儿时的成长经历，介绍了自己的成功之路。书中包含大量英国文化词汇，并且涉及范围广泛。在翻译中，该类词汇很具有挑战性。本文在Through Thick and Thin一书的翻译实践基础上，总结出两条文化词汇的通用翻译原则，即以表意准确为前提，以提供等量信息为目标。笔者在研究过程中，将书籍中的英国文化词汇分为教育、饮食和时尚三类，针对每类文化词汇提出相应的翻译策略。教育词汇的汉译策略主要包括发掘共性，借鉴汉语习惯表达；厘清差异，文化空缺模糊化；联系语境，辨别一词多义。饮食词汇可分为英国本土食品名称、粤菜英文名以及粤音菜名和餐馆名三类，对应的翻译方法分别为：分析做法，寻求名称对等；定位菜系，甄选特色用语；兼顾音义，借助粤音工具书推测验证。翻译时尚词汇时应以图为据，释义描述，并做到追根溯源，传递英国亚文化信息。本文不仅研究英国本土特色词汇，而且将少量融入英国文化的外来词汇作为研究对象之一，如中餐英文名和粤音词汇。目前很多英国文化词汇还缺少准确译文。译者在翻译时首先应准确定位词汇文化内涵，深度搜索并分析相关信息，然后借助文化词典，比对同类书籍和影视作品中的译文，发挥自身双语优势，探究最合适的翻译策略。本文旨在研究英国文化词汇的汉译策略，以提高此类词汇翻译的准确性和得体性，同时唤起相关领域对英国文化词汇翻译的重视，让中国读者了解更多英国文化，促进中英文化的沟通与交流。﹀
外文摘要：	︿ This translation project is based on the book Through Thick and Thin, an autobiography of the well-known British-born Chinese fashion consultant Gok Wan. Published in 2010, the book is about Gok's childhood stories and his odyssey to success. It abounds with various British cultural words, which pose great challenges to the whole translation process. Based on the translation project, two principles for translating cultural words are summarized, that is, precise cultural interpretation and equal information for target language readers (as source language readers may have). The British cultural words in this book are generally divided into three categories: education, food, and fashion. Corresponding strategies are then proposed for each category. When translating education-related words, the translator may consider hypernyms and Chinese idiomatic expressions, and identify polysemous words by use of context. As for British food names, recipe analysis and identification of equivalent Chinese food are recommended. Back-translation of Chinese dish names requires choosing words according to the specific cuisine, and identifying Jyutping by combining pronunciation and meaning. Fashion-related words can be better understood with pictures, and interpretation of their subcultural connotations is also suggested. Currently, there is a lack of precise translation for many British cultural words. In translation, a clear understanding of these words comes first. Then, with reference to cultural dictionaries and existing translations, a proper translation strategy can be worked out based on a deep understanding of both cultures. This research aims to explore some strategies for translating British cultural words, in the hope of improving the precision and appropriateness of cultural translation as well as promoting Anglo-Chinese cultural communication. ﹀
分类号：	H087/TP391
论文总页数：	203
参考文献总数：	45
馆藏号：	017/M2016(943)
公开日期：	2016-05-30

TED演讲中说服力的语言层面分析.李琳


题名：	TED演讲中说服力的语言层面分析
姓名：	李琳
学号：	1201210652
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	对外经济贸易大学
导师2姓名：	张宏岩
导师2单位：	软件与微电子学院
论文答辩日期：	2016-05-30
关键词：	TED演讲；说服力；语言层面
论文摘要：	︿本文选取TED演讲为研究对象，从语言层面研究TED演讲中说服力的实现方式，以期为公共演讲中更好地实现演讲目的和说服效果提供参考。本研究采用了文献研究法、语料库研究法、跨学科研究法，以及定量分析与定性分析等研究方法。本文首先进行了TED演讲和演讲说服力的文献研究，然后根据TED评价系统的用户评价数据自建了四个语料库：说服语料库、非说服语料库、负面评价语料库，以及除说服力外的其他正面评价语料库。从话题、演讲者、演讲时长的角度和语料库角度出发，本文分析了各类语料的特征，研究各因素与演讲说服力之间的关系。之后，对67篇说服语料中说服力的表现方式进行了具体分析，包括修辞疑问句、具体数据的使用，以及说服语篇与负面评价语篇中同主题演讲的内容对比，结合实例来分析这些因素在实现说服效果方面发挥的作用。通过对TED演讲说服语篇的研读，笔者提出了三类典型的有助于增强说服力的修辞疑问句：附有深刻即时回答的普通问题、拓展思维的连续问题和引人深思的演讲结束问题。同时，根据数据描述对象的不同，笔者将其中有助于实现说服效果的数据归纳为以下三类：描述事实或统计结果、描述个人经历或故事，以及描述对未来的思考或预测。研究发现，TED演讲的说服力与演讲者的性别、国籍、演讲语篇的平均词长、平均句长和词汇密度等关系不大，修辞疑问句、数据的恰当使用和演讲内容的有效组织有利于增强说服效果。此外，TED演讲的说服力与演讲话题存在一定关联，说服语料中个别词的使用频率高于另外三类语料。TED演讲的受众遍及全球，其说服演讲中说服力的呈现方式对其它公共演讲具有一定的借鉴意义。本文研究主要集中在语言层面，目的在于为英语演讲者在类似的公共演讲中提供参考，为相关学科的进一步研究奠定基础。﹀
分类号：	H082
论文总页数：	85
参考文献总数：	39
馆藏号：	017/M2016(958)
公开日期：	2016-05-30

《戴尔模式》一书中情感词汇的翻译策略.高月


题名：	《戴尔模式》一书中情感词汇的翻译策略
姓名：	高月
学号：	1301210629
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-30
关键词：	情感词汇；褒贬；翻译策略
论文摘要：	︿本文基于《戴尔模式》（How Dell Does It）一书的翻译实践，该书主要介绍的是戴尔的成功商业模式，比如直销、零库存、客户至上、供应商管理等。书中涉及多个利益相关的主体，各主体间情感、立场鲜明；作者对戴尔持积极认可的态度，而且大量引用了杂志、访谈、新闻等态度鲜明的第三方观点。所以，与作者情感共鸣，准确再现原文的情感至关重要。而词汇是语言的基础，所以本文选取的研究对象是情感词汇，即表达对客观事物的感受、态度或褒贬评价的词汇。情感词汇的翻译受到多种因素影响，本文从语境、作者立场和文化三个角度进行了深入探讨。首先结合MPQA情感词典的局限性，说明了语境如何影响词汇的情感倾向和情感强度。然后结合《戴尔模式》一书的翻译实践，探讨了书中各主体间的态度立场对褒贬词义选择的影响。最后从文化角度将情感词汇细分为情感等值词、情感部分等值词、情感不等值词和情感空缺词，并结合本书中多次出现的“underdog”一词对情感部分等值词进行了更详细地阐释。针对情感词汇的翻译，笔者结合本次翻译实践总结出语义翻译、阐释引申含义、加注注释情感、增词补充语义等策略，并结合此次翻译项目中的实例加以说明。同时，笔者充分利用了平行文本，COCA语料库，多种版本的字典以及搜索引擎等翻译辅助工具，从而更好地挖掘词汇的情感内涵，以期为情感词汇的翻译提供借鉴。﹀
分类号：	H087/TP391
论文总页数：	216
参考文献总数：	0
馆藏号：	017/M2016(899)
公开日期：	2016-05-30

基于自适应学习模式的高中英语听力教学研究.宋凌云


题名：	基于自适应学习模式的高中英语听力教学研究
姓名：	宋凌云
学号：	1301210907
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-30
外文题名：	Applied Research on Listening Teaching of Senior High Based on Adaptive Learning
关键词：	自适应学习；高中听力教学；个性化学习
外文关键词：	adaptive learning; listening teaching of senior high; individualized learning
论文摘要：	︿听力是第二语言习得最重要的内容之一，虽然不同学生之间听力水平和能力具有较大差别，但在现有的听力课堂上，所有学生的听力材料和教学方法完全相同，因此现有的听力教学无法满足每个学生的需求。随着互联网教育的发展，自适应学习的教学思想在一定程度上能够满足每个学生个性化的学习需求。它能根据学生的相关信息和既定的听力材料推荐规则，为不同听力水平的学生推荐合适的听力训练材料，实现英语听力教学的因材施教。本研究以高中英语听力教学为切入点，根据自适应学习和听力教学的相关研究成果，尝试将自适应学习方法应用到高中英语听力教学中，提高高中英语听力教学效率。本研究从易读性、词汇覆盖率、语速3个方面分析了256篇听力音频，从听力成绩、词汇量、语音水平和语音记忆4个方面衡量了学生的听力水平，结合多条自适应规则，为不同听力水平的学生提供了个性化的听力学习资料。为了验证本文设计的高中英语听力教学设计的有效性，本研究在北京某公办高中对28名高中一年级学生进行了为期8周的听力教学实验。对照组采用传统的高中英语听力训练方法进行听力训练，所有被试的听力学习进度、学习任务、情感策略均完全相同；实验组采用本文设计的自适应高中英语听力教学方法进行听力训练，每个被试的听力学习材料均根据学生的相关信息和自适应规则按需提供。教学实验结束后，本研究对两组被试的整体听力水平、信息辨认能力、逻辑思维能力、信息转述能力进行了后测，后测结果表明：本研究设计的高中英语听力教学方法在整体上比传统非自适应的听力教学方法更有效，但这种有效更多体现在学生的信息转述能力、信息辨认能力上，尤其是信息转述能力，而逻辑思维能力上没有显著差异。﹀
外文摘要：	︿ Listening is one of the most important aspects of second language acquisition. Although there are huge differences in students' listening ability, at present all students are taught by the same teaching method with the same listening material. Obviously, the current teaching mode cannot meet each student's requirements. With the development of Internet education, the adaptive teaching method is able to meet each student's individualized learning needs to some extent. With this new method, students are provided with different listening materials according to their personal information and listening material recommendation rules. Focusing on listening teaching in senior high school, this study attempts to apply the adaptive learning method to senior high school listening teaching in order to improve teaching efficiency. This study first analyses 256 listening materials in terms of readability, vocabulary coverage rate and speech rate, then assesses students' listening level in terms of listening score, vocabulary size, pronunciation level and phonetic memory, and finally, using several adaptive rules, provides students with personalized listening materials. In order to verify the effectiveness of the proposed teaching method, an eight-week experiment was carried out in a Beijing public high school with 28 first-year senior high school students. The control group adopted the traditional teaching method, with the same teaching materials, tasks and affective strategies for all participants, while the experimental group adopted the adaptive listening teaching method proposed in this study. That is, each individual in the experimental group was provided with different listening materials based on the analysis and the adaptive recommendation rules. After the experiment, a post-test was carried out focusing on overall listening level, information identification ability, logical thinking ability and information relaying ability. The post-test results showed that the listening teaching method designed in this thesis is, in general, more effective than the traditional non-adaptive one. There is an obvious difference in information relaying ability and information identification ability, especially the former. Yet there is no obvious difference in logical thinking ability. ﹀
分类号：	H087/TP391
论文总页数：	84
参考文献总数：	85
馆藏号：	017/M2016(940)
公开日期：	2016-05-30

基于参与主体视角的IT图书翻译出版活动研究——以图灵教育《洞悉数据》的翻译出版为例.刘云涛


题名：	基于参与主体视角的IT图书翻译出版活动研究——以图灵教育《洞悉数据》的翻译出版为例
姓名：	刘云涛
学号：	1401210659
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软件与微电子学院
导师2姓名：	张宏岩
导师2单位：	软件与微电子学院
论文答辩日期：	2016-05-30
关键词：	IT图书；翻译出版；读者；译者；编辑
论文摘要：	︿近年来，国内引进版IT图书翻译出版活动随着互联网行业的迅速发展而日渐兴起。然而，笔者在与图灵公司合作翻译出版《洞悉数据》一书的过程中发现，引进版IT图书的翻译出版活动的各个参与主体还存在不同的问题，导致引进版IT图书翻译出版活动效率不高，图书翻译质量参差不齐。本文首先对《洞悉数据》一书的翻译出版过程进行了总结，介绍了项目相关背景和IT图书翻译出版流程；其次，通过文献分析方法对图书翻译出版的国内外研究进行了总结，指出图书翻译出版实务活动研究的特殊性；随后，通过问卷调查和访谈，分别从读者、编辑和译者三个参与主体的角度对IT图书翻译出版流程进行分析，发现了各自存在的问题：在读者层面，本文从阅读诉求和译者偏好两个方面总结出读者的合理诉求和不合理诉求；在编辑层面，分析了目前编辑存在的主客观问题以及话语权力的问题；在译者层面，分别对语言背景译者和技术背景译者分别进行分析，对译者存在的问题进行了分析研究。基于上述研究，本文发现读者、编辑和译者在IT图书翻译出版活动中相互制约和相互影响的关系，其中编辑则是整个翻译出版活动的纽带，连接读者和译者的桥梁。最后，本文分别从读者诉求、编辑话语权力以及译者流失等方面提出相应策略。本文从读者、编辑和译者三个方面重新审视IT图书的翻译出版活动，从宏观角度研究图书翻译出版活动。本文的研究结论能够促进读者、编辑和译者在翻译出版过程中良好协作，提升引进版IT图书的翻译质量和出版效率。﹀
分类号：	H087/TP391
论文总页数：	205
参考文献总数：	0
馆藏号：	017/M2016(979)
公开日期：	2016-05-30

模糊匹配句段与译者认知努力相关性的研究.张能


题名：	模糊匹配句段与译者认知努力相关性的研究
姓名：	张能
学号：	1301211088
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	柏晓静
导师1单位：	软件与微电子学院
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2016-05-30
外文题名：	Research on Relations between Fuzzy Matches and Cognitive Effort
关键词：	翻译记忆；模糊匹配；认知努力
外文关键词：	translation memory; cognitive effort; fuzzy match
论文摘要：	︿语言服务行业丰润变革与快速发展催生了翻译技术的成熟，尤其是计算机辅助翻译技术和机器翻译技术。翻译记忆作为计算机辅助翻译的核心技术，将模糊匹配句段输出供译者翻译参考，以提高翻译效率和质量。在这样的翻译过程中，译者与计算机的交互过程和认知过程也相应地发生了改变。此外，翻译记忆也改变了语言服务行业的翻译费率计算机制，即会对模糊匹配句段给予部分折扣。本文以匹配率不同的模糊匹配句段为研究对象，旨在探索模糊匹配率与译者认知努力间的相关性，从而帮助译者更好地利用翻译记忆、更好地预估翻译时间。此外，本研究为翻译产业计费机制的建立和模糊匹配相似度算法的改进提供了一些研究基础。本文将模糊匹配区间梯度设置为无匹配、55%-64%、65%-74%、75%-84%、85%-94%、95%-99%、100%和机器翻译译文八种模糊匹配类型展开对比研究，以击键记录、录屏和回溯性访谈线上线下研究相结合的研究手段，从任务时长、击键次数、停顿次数、停顿时长和意识负担几个维度研究翻译认知研究中最为合适的停顿阈值以及各模糊匹配类型下译者在翻译过程中付出的认知努力。通过各方面数据的统计分析，实验发现，时长1200ms最适合作为本实验中翻译认知研究的停顿阈值。在匹配率为100%-55%的模糊匹配区间，译者的认知努力总体上随着匹配率的降低而增加，但呈非线性关系：100%-95%模糊匹配区间，译者的认知努力急剧增加（7.5%增加为15.8%）；95%-75%的继续较明显增加，但增幅越来越小，且均小于100%-95%模糊匹配区间；75%-55%的继续增加，但较不明显，增幅小于95%-75%模糊匹配区间；55%-0%的则稍有增加，逐渐趋于稳定；此外，医学领域文本中，Google Translate机器翻译后编辑付出的认知努力小于人工翻译、大于64%-55%模糊匹配区间的认知努力。﹀
外文摘要：	︿ Rapid development in the language service industry has brought advanced translation technologies, especially computer-aided translation (CAT) and machine translation (MT). Translation memory (TM), the core technology of CAT, generates fuzzy matches for translators to refer to, which improves translation quality and efficiency. In such a translation process, the interaction between translators and computers, as well as translators’ cognitive process, changes accordingly. Moreover, TM has transformed the translation pricing model: fuzzy match segments are charged at a discounted rate. This thesis studies fuzzy match segments at different match levels to explore the relations between fuzzy match rate and cognitive effort. This research not only has implications for how to make the best of TM but also provides a research basis for the pricing model in the Chinese translation industry. This thesis compares eight fuzzy match types: no match (0%), 55%-64%, 65%-74%, 75%-84%, 85%-94%, 95%-99%, 100% and MT output. Online and offline research methods such as keystroke logging, screen recording and retrospective interviews are employed to collect data on task time, number of keystrokes, number of pauses, pause duration and perceived effort, in order to determine the most suitable pause threshold for translation cognition research and the cognitive effort translators exert in the translation process. Statistical analysis shows that the most suitable pause threshold for this study is 1200 ms. Generally, translators’ cognitive effort increases non-linearly as the fuzzy match rate decreases. In the 100%-95% fuzzy match interval, cognitive effort rises sharply from 7.5% to 15.8% as the match rate drops; in the 95%-75% interval, cognitive effort continues to increase, but not as sharply as in the 100%-95% interval; in the 75%-55% interval, cognitive effort increases slowly, at a lower rate than in the 95%-75% interval; in the 55%-0% interval, cognitive effort grows slightly and begins to stabilize; and translators’ cognitive effort when post-editing Google Translate MT output is slightly lower than with manual translation and a little higher than in the 64%-55% fuzzy match interval. ﹀
分类号：	H087/TP391
论文总页数：	113
参考文献总数：	44
馆藏号：	017/M2016(915)
公开日期：	2016-05-30

2016-05-24

面向受限领域的语义分析系统的设计与实现.郝瑞祥


题名：	面向受限领域的语义分析系统的设计与实现
姓名：	郝瑞祥
学号：	1301221011
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	软件与微电子学院
导师2姓名：	高志军
论文答辩日期：	2016-05-24
外文题名：	semantic analysis system for restricted field
关键词：	语义分析；语音助手；人机交互；文本分类
外文关键词：	semantic analysis; voice assistant; text classification
论文摘要：	︿近些年来，智能手机越来越普及。手机设备和PC相比，有它的优点和不足。优点是手机设备很小巧，便于携带；不足的地方是手机的键盘输入比较麻烦，人机交互体验很差。很多商业机构都致力于改善手机的人机交互体验，来取得更大的市场份额。美国的苹果公司在2008年推出的智能手机充分利用了电容屏的触摸感应能力和多点触摸的特点，极大地提高了用户的使用体验，占据了市场的主导地位。手机相对于PC来说，有一个先天的优势，就是它带有音频传感器。在商业方面，这极大地推进了人机语音交互的发展。人机语音交互的两大难题是语音识别和语义分析理解。随着深度学习技术的发展，语音识别问题得到了质的飞跃，这使得语义分析和理解问题更加突出。本文在语音识别的基础上，实现了一个语义分析系统，可以把用户的语音命令转成相应的手机执行动作，很大程度上提高了用户使用语音交互的体验。本文设计和实现的语义分析系统是服务于手机语音助手的，在语音识别的基础上来做语义分析和理解。相对于语音识别来说，语义分析的难度更大。从现有的一些聊天机器人可以看出，问答系统还处于一个初级阶段，给人的用户体验还比较差。本文的语义分析系统和问答系统比较类似，但是在问题领域做了限制，从而降低了问题的难度，可以在一定程度上提高用户的使用体验。在实现方面，本文首先对用户的说话文本进行相应的领域分类，再对分好类的文本按照该类别的规则方法和统计方法进行相应的提取。由于领域分类在前，所以分类的正确率直接影响了整个系统的表现。本文使用了深度学习技术CNN来做文本分类，相对于传统的文本分类方法，如SVM、最大熵等方法，正确率高出了近10个百分点，达到了96%。语义的分析和理解问题，是一个综合性的问题，一些简单的文本分类、文本聚类、文本检索等技术都不能单独很好地解决这个问题。因此，本文在系统中引入了知识图谱的方法，可以在一定程度上做一些简单的推理。本文也创新性地加入了用户画像功能，使语音助手不再是一个通用的语音助手，而更像是一个非常了解自己的私人助理。不同用户的相同说话命令，可以根据用户的不同特点转化成不同的执行结果。最终，笔者设计并实现了这一系统，并通过试验评测给出了领域分类模块的正确率和语义提取模块的正确率和召回率。﹀
外文摘要：	︿ In recent years, smartphones have become increasingly popular. Compared with PCs, mobile devices have both advantages and disadvantages. The advantage is that a mobile phone is compact and easy to carry; the disadvantage is that entering text on a phone is cumbersome, so the interactive experience is poor. Many companies are working hard to improve the phone's interactive experience in order to gain a greater market share. The smartphone that Apple released in 2008 took full advantage of the touch sensitivity of capacitive screens and of multi-touch, greatly improved the user experience, and came to dominate the market. Compared with the PC, the phone has an inherent advantage in that it comes with an audio sensor. Commercially, this has greatly promoted the development of human-machine voice interaction. The two major problems in human-machine voice interaction are speech recognition and semantic analysis and understanding. With the development of deep learning, speech recognition has improved greatly, which makes the problem of semantic analysis and understanding more prominent. On the basis of speech recognition, this paper implements a semantic analysis system that turns a user's voice commands into the corresponding actions on the phone, greatly improving the user experience of voice interaction. The semantic analysis system in this paper serves a mobile phone voice assistant and builds on speech recognition. Semantic analysis is more difficult than speech recognition. As can be seen from some existing chatbots, question answering (QA) systems are still at a preliminary stage and the user experience is still relatively poor. The semantic analysis system is similar to a QA system but is restricted to certain domains, which reduces the difficulty of the problem and can improve the user experience to some extent. The first step is domain classification of the user's text; the second step is information extraction using the rule-based and statistical methods of the assigned domain. Because information extraction depends on domain classification, classification accuracy directly affects the performance of the whole system. This paper uses a convolutional neural network (CNN) for text classification. Compared with traditional text classification methods such as SVM and maximum entropy, accuracy is nearly 10 percentage points higher, reaching 96%. Semantic analysis and understanding is a comprehensive problem that text classification, text clustering or text retrieval alone cannot solve well. Therefore, this paper introduces knowledge graph technology into the system, which enables some simple reasoning. A user profile function is also added, so that the system can respond differently to different users, more like a personal assistant. Finally, the author designed and implemented the system and reports the accuracy of the domain classification module and the accuracy and recall of the semantic extraction module. ﹀
分类号：	TP
论文总页数：	43
参考文献总数：	0
馆藏号：	017/M2016(547)
公开日期：	2019-05-24

基于层级模型的文本分析及其应用.李茹蒙


题名：	基于层级模型的文本分析及其应用
姓名：	李茹蒙
学号：	1401210609
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-24
关键词：	文本分析；层级结构模型；动态文摘；时间线生成；文本主题分析
论文摘要：	︿文本结构的分析在自然语言处理中起着十分重要的作用。近年来层级模型被广泛应用于文本结构分析中。层级模型，将文本划分为不同的层次。这种划分可以是基于自然的粒度，如词、短语、句子、段落和篇章；也可以基于话题：如一级话题，二级话题，三级话题等。本文尝试通过三种不同层级结构模型，对文本结构进行分析，并将分析的结果应用到实际应用任务中去。本文分别对动态文摘、时间线生成及文本主题分析任务进行了初步的探索，并获得了较好的效果。动态文摘任务：与传统的多文档自动文摘任务相比，结合了重要性排序和新话题发现两大难点。本文使用层级潜在Dirichlet分布(Hierarchical Latent Dirichlet Allocation)模型分析文本内部的主题结构，从模型构建的树结构中可以辨别出历史文档集和当前文档集中相似和不同的内容，不仅有助于当前文档集中新话题的发现，也能兼顾历史文档集中旧主题在当前文档集中的发展。基于分析的结果设计了一种新的摘要生成算法，使获得的文摘具有代表性，新颖性及非冗余性。新闻语料的时间线生成任务：对一个长期发展的新闻事件，主题通常会在不同的时间段内包含许多不同的副主题，而每个副主题又有自身的发展过程。现有的方法并没有很好的利用这样的层级主题结构。本文提出了一个新的依赖于时间的层级Dirichlet树模型来检测语料中不同层次的话题信息，全面考虑相关性、连贯性和覆盖性挑选句子生成时间线。本文在多个公众长期关注的新闻事件语料上进行了广泛的实验，证明了所提模型的有效性。文本主题分析任务：检测一篇文本包含的主题及主题的演变过程，包括主题识别，主题的边界确定，主题间关系剖析等任务。本文着重于主题识别的研究工作，提出了一个新的基于神经网络模型的主题分析方法，在PubMed上的实验首次探索了神经网络模型在该任务上的表现。﹀
分类号：	TP3-05.
论文总页数：	60
参考文献总数：	0
馆藏号：	017/M2016(549)
公开日期：	2016-05-24

基于“写长法”的英语写作计算辅助技术研究.刘艳珣


题名：	基于“写长法”的英语写作计算辅助技术研究
姓名：	刘艳珣
学号：	1401210655
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-24
关键词：	写长法；计算语言学；词语推荐；句子推荐
论文摘要：	︿从许多历年毕业生身上可见，英语学习者用语言来表达思想是很困难的一件事，即使是英语专业学生在面对英文长文写作时都是如履薄冰，难以顺利完成，对于非英语专业同学而言更是如此。学习英语的最终目的是运用和交流，体现在写和说方面。写是英文学习综合能力的体现，较说容易一些，促学效果更好，因而应该予以重视。 “写长法”是一个关于写作方面的外语教学理念，它的基本思路是：设计能够刺激学生表达欲望的写作任务，逐渐要求增加写作字数，使学生突破外语学习的困境，从而增加写作的成就感和满足感，促使将知识转变成实用能力的过程。该理念已运用于许多高校的英语课堂教学中，获得了良好效果和广泛好评。在信息化和互联网发展蒸蒸日上的今天，如果在计算机上运用“写长法”来辅助英文学习者进行写作，则一方面可以减轻教师的指导压力，另一方面利用大量电子化资源帮助学生“写长”。围绕该理论的关键点和目的，本文从计算语言学中的词语和句子层面，创新性地研发了一个基于“写长法”教学理论的英文辅助写作系统，在学生原有作文的基础上推荐相关并具有启发性的词汇和句子，促进学习者增加词汇量，学习新句型，拓展写作思路，从而提高学习者的外语能力。由于高中是非英语专业学生学习英语较集中、语法使用相对规范而且有大量潜力未被发掘的时期，本文主要针对高中英语学习者设计了一个基于“写长法”的自动化辅助英文写作系统，主要研究内容如下：词语推荐的关键技术和实现方法。本文系统的核心内容之一是对于作文中使用的主要词汇推荐难度适宜的同义词，对此采用了基于语义词典和基于点互信息余弦夹角的融合方法生成计算词语相似度，排除了偏难词汇并使用基于词频的方法进行难度计算，最后利用有监督学习的方法生成了词语推荐的优先度模型，按照优先度顺序进行词语推荐。句子推荐的关键技术和实现方法。通过推荐原句相近且质量高的句子，用户可以直接运用或受到连环启发从而促进写长。本文提出了基于搜索引擎的相关句获得方法，利用句子的语义相似度计算方法，使用机器学习的方法对句子质量做了评价，用梯度下降法寻找到了最优融合参数，实现了基于句子相似度和句子质量评价的推荐方法。系统整体框架和各个模块的设计。系统的框架模块包括界面，服务器，推荐模块服务程序，核心算法模块以及评价反馈模块。对于系统两大功能和系统性能的评价,本文设计了评价问题并采用用户人工打分的方式。﹀
分类号：	TP312
论文总页数：	61
参考文献总数：	0
馆藏号：	017/M2016(628)
公开日期：	2016-05-24

基于循环神经网络的端对端文本蕴含识别.王旭光


题名：	基于循环神经网络的端对端文本蕴含识别
姓名：	王旭光
学号：	1301210977
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-24
关键词：	文本蕴含；循环神经网络；自然语言推理；问答系统
论文摘要：	︿文本蕴含识别是指：判断是否可由一段自然语言文本推出另一段自然语言文本，它是最能反映自然语言理解的任务之一。不同于形式逻辑推理，文本蕴含考查的是对于模糊的，有歧义的，表达形式多变的自然语言文本的推理能力。目前，文本蕴含识别有基于形式逻辑的方法；基于启发式规则的方法；基于特征工程加机器学习的方法；基于最近兴起的深度学习的方法。不同于上述方法，本文提出的文本蕴含识别模型，不利用规则和形式逻辑，不构造人工特征，而是考虑文本间的相互作用构建一种端对端的（输入是原始自然语言文本，输出是判别结果）基于LSTM（Long-Short Term Memory）的深度神经网络DBCLstm。与其他深度学习方法相比，本文的模型可以得到待识别文本对的高层次交互（interaction）的抽象表示，这对于文本蕴含识别而言至关重要。为了验证本文方法的有效性，我们首先在公开测评数据集SICK（Sentences Involving Compositional Knowledge）上做了一组实验，测试集上的最高正确率为82.04%，超过了斯坦福自然语言处理组的实验结果；其次我们在公开的大规模文本蕴含数据集SNLI（The Stanford Natural Language Inference Corpus）上进行实验，最终结果超过了提取特征加分类器的方法及大部分句子编码（sentence encoding）类方法。最后，我们论述了文本蕴含识别与问答系统的关系并将本文的模型应用于中文事实类问答系统。问答系统的任务目标及文本风格皆异于前两个实验，它是本文模型完整的一个应用场景，其数据集的构建，数据标注质量的控制，对比实验的设计均由我们自己完成。最终的对比实验结果显示，我们的各项测评指标均超过了传统的特征提取加分类器的方法。﹀
分类号：	TP3-05
论文总页数：	48
参考文献总数：	0
馆藏号：	017/M2016(677)
公开日期：	2016-05-24

基于机器学习的智能英语教材编纂系统.王志伟


题名：	基于机器学习的智能英语教材编纂系统
姓名：	王志伟
学号：	1301210988
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-24
关键词：	篇章难度排序；英语教材编纂
论文摘要：	︿当前英语学习主要有两个突出的问题：一是个性化问题，对于不同能力、不同水平的学生采用相同的教学方式和方法，没有做到因材施教，同时在了解学生最初能力之后制定了学习计划，但是在后期没有根据学生学习能力和知识的增长来自适应地调整学习计划；二是教育公平问题，对于家庭条件好、处于一线城市的学生来说，通常可以享用较好的英语教学资源，而贫困或者一般家庭则没有办法享受到。为了解决这两个突出的英语学习问题，一些互联网教育公司研发了基于机器学习的自适应英语学习系统。对于教育公平问题，互联网本身的共享性质就解决了这个问题，具体来说，背后的基于大数据的机器学习引擎对于所有用户都是一样的；对于个性化与自适应问题，自适应系统首先会通过严格科学的测试来检测用户当前学习能力和英语水平，然后个性化地定制英语学习路线，同时在学习过程中系统会根据用户学习进度和掌握水平进行个性化动态调整和制定路线，然后选择对应难度的英语教学材料，这就需要有一个海量的具备难度排序的英语教学材料集合，目前尚没有成熟的适应于自适应英语学习的编纂系统。本文研究的目标是提出一个适应于自适应英语学习系统的智能英语学习教材编纂系统的解决方案，该方案包括英语文章网络爬虫与预处理子系统、篇章特征抽取子系统和篇章排序子系统。该系统包含离线和线上两部分，离线部分根据标准英语教材材料训练难度排序系统，线上部分会实时对爬虫获取的候选阅读材料进行排序，最后输出带有不同维度标注的难度有序的英语阅读材料库。本文所做工作如下：1. 结合 Coh-Metrix 和英语词汇教学，研发了篇章特征抽取系统 FeatureMagic；2. 研发了基于 Learning to Rank 的英语阅读文章难度排序系统。本文所研究的基于机器学习的智能教材编纂系统是国内外第一个适应于自适应英语学习的基于机器学习的智能英语教材编纂系统，该系统编纂的英语教材从描述性统计、指代衔接、潜在语义分析、词汇多样性、句法复杂度、句法模式密度、词信息和词表覆盖率等角度对英语材料进行了严格科学的标注和难度排序。本文研发的解决方案已经成功地应用于北京大学俞敬松老师研发的英语作文自动打分和自适应英语学习系统，同时也成功地运用于 2012 级吕京的英语教学研究中。﹀
分类号：	TP181
论文总页数：	51
参考文献总数：	0
馆藏号：	017/M2016(765)
公开日期：	2016-05-24

Query切分及其在相关性排序上的应用.卢刘杰


题名：	Query切分及其在相关性排序上的应用
姓名：	卢刘杰
学号：	1301210825
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-24
外文题名：	Query Segmentation and Its application in Relevance Ranking
关键词：	Query切分；相关性排序；LambdaMART；搜索引擎
论文摘要：	︿目前流行的商业搜索引擎如Google、Sogou与用户之间的交互主要是以Query方式进行的。用户向搜索引擎提交的Query常常包含一些复合概念及短语，这些概念和短语充分表明了用户想要查询的主题，然而大多数用户却并没有对复合概念及短语进行完全匹配检索的意识。因此为了捕捉用户在Query中关注的主题、改善搜索结果，提升用户对于搜索引擎的满意度，在搜索引擎对用户Query进行正式检索之前，进行Query切分便显得异常重要起来。Query切分不仅能够帮助搜索引擎更精准地理解用户查询意图、提升检索结果，其对语义检索模型、基于复合概念及短语的查询重构等工作也大有裨益。正因为Query切分在上述工作中的作用，其在现代信息检索领域中一直是很重要的一部分工作。前人以识别用户Query中的概念及短语为目的进行了大量研究工作，研究者们还探究了如何利用Query切分的结果以提升相关性排序。基于上述背景，在前人工作的基础之上，本文主要完成了以下工作：(1) 之前的Query切分研究主要是在英文环境下进行的。而据本文作者所知，中文环境下的Query切分研究目前尚属空白。针对此问题，本文创建了第一个用于Query切分研究的中文数据集。(2) 针对Hagen等人提出的Query切分打分函数的不足，本文设计了新的Query切分打分函数。新的Query切分打分函数借助于更丰富的语料资源、可调试的语料类型加权因子、概念词条激励因子以及粒度控制因子，实现了更灵活可控的Query切分。(3) 本文探究了重排序模型下不同特征集合的表现。实验表明，通过在原有特征集合中增加切段搜索次数、名词数目以及紧密度等特征可以在一定程度上提升Query切分的准确率。(4) 在将Query切分结果应用于相关性排序方面，前人的实验主要是在模拟的搜索引擎环境下进行的。而本文将Query切分应用于真实的线上相关性排序环境，进行了一系列更具说服力的实验。实验结果表明，利用Query切分结果可以有效提升相关性排序，相关性排序的提升与Query切分正确率在一定程度上存在正相关的关系。﹀
外文摘要：	︿ Currently popular commercial search engines such as Google and Bing interact with users mainly through queries. A query submitted by a search engine user often contains composite concepts and phrases. These concepts and phrases clearly indicate the subject that the user wants to search for, but most users are not aware that such composite concepts and phrases can be retrieved by exact matching. Therefore, in order to capture the subject the user focuses on in the query, improve search results and enhance the user's satisfaction with the search engine, segmenting the query before formal retrieval is very important. Query segmentation not only helps the search engine understand the user's query intent more accurately and improve search results, but also benefits semantic retrieval models and query reformulation based on composite concepts and phrases. Because of this role, query segmentation has always been an important part of modern information retrieval. Researchers have done a lot of work on identifying the concepts and phrases in user queries, and have also explored how to use query segmentation to enhance relevance ranking. Building on previous work, this paper makes the following contributions: (1) Previous query segmentation research was mainly conducted on English. As far as we know, query segmentation research on Chinese is still a blank. To address this, this paper creates the first Chinese data set for query segmentation research. (2) To address the deficiencies of the query segmentation scoring function proposed by Hagen et al., this paper proposes a new scoring function. With the help of richer corpus resources, an adjustable corpus-type weighting factor, a concept term incentive factor and a granularity control factor, the new scoring function achieves more flexible and controllable query segmentation. (3) This paper explores the performance of different feature sets in the reranking model. Experiments show that the accuracy of query segmentation can be improved to some extent by adding features such as segment search counts, number of nouns and tightness to the original feature set. (4) In applying query segmentation to relevance ranking, previous experiments were mainly conducted in simulated search engine environments. In this paper, query segmentation is applied to a real online relevance ranking environment and a series of more convincing experiments is carried out. Experiments show that query segmentation can effectively improve relevance ranking performance and that the improvement is positively correlated, to some extent, with the accuracy of query segmentation. ﹀
分类号：	TP391.1
论文总页数：	49
参考文献总数：	41
馆藏号：	017/M2016(819)
公开日期：	2019-05-24

面向多轮对话的语句建模研究.薛卉

链接

题名：	面向多轮对话的语句建模研究
姓名：	薛卉
学号：	1301211030
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-24
外文题名：	SENTENCE MODELING BASED ON MULTI-TURN FOR DIALOGUE SYSTEM
关键词：	多轮对话；语句建模；神经网络
外文关键词：	Multi-turn; Sentence modeling; Deep learning
论文摘要：	︿在开放领域的对话系统中，准确理解用户想表达的语义，是一个非常重要、难度很大的任务。用户想表达的语义空间可能非常大，话题之间也可能存在跳跃。一旦系统可以准确捕捉和理解用户的语义，就意味着系统可能可以准确理解用户情感倾向、识别用户潜在意图甚至对用户的意图和动作进行引导，并生成准确流畅的回答。当前的研究表明，语句建模的主要问题和困难在于：第一，很多研究仍然基于命名实体、句法分析等需要大量人工标注的语料或者知识库进行建模，需要大量先验的知识。第二，很多语句建模的方法，都是基于冗杂的特征工程，缺乏一个完整的建模思路。第三，现在对话系统基本都基于单轮对话建模，没有完全利用语句间的关系对语句进行建模，忽略了语句间关系的有利信息。近几年，深度学习在自然语言处理的任务上取得了不错的成果，使我们有理由相信，适合的网络结构具有表征句子语义空间和抽象句子、理解句子的能力。本文尝试结合多种神经网络结构，构建在多轮对话的场景下的语句建模和表达。本文的工作主要在于：开创性地提出了特殊的网络结构，用于构建在多轮对话的情境下的语句建模。首先，本文通过将模型的中间结果进行分析，寻找其中的语义线索，从而分析模型的可解释性。其次，本文在来源真实、采样均匀的中文数据集上进行评测，和传统的机器学习方法相比有了不少的提高，这意味着该模型方法具有在不规则的文本上构建上下文相关的对话系统的潜力。﹀
外文摘要：	︿ Understanding what users want to express in an open-domain dialogue system is both difficult and important. By accurately capturing sentence semantics, a system could correctly identify user sentiment, recognize user intentions, and generate smooth, accurate responses. However, sentence modeling faces several problems: 1) Many studies of sentence modeling still rely on named entity recognition, syntactic analysis, and other methods that need large manually annotated corpora or a lot of prior knowledge. 2) Many methods depend on time-consuming feature engineering and lack a complete system design. 3) Most dialogue systems model sentences on a single turn, which may ignore the relationships between consecutive sentences. In recent years, deep learning has achieved excellent performance on some natural language processing tasks, so we have reason to believe that sufficiently deep neural networks are able to model sentences. This paper attempts to model multi-turn dialogue with different neural network architectures. It proposes a deep neural network that addresses the conversation modeling problem by end-to-end learning, so as to help select conversationally relevant responses. First, our model is able to obtain both the local key phrases within the sentences and the sequential semantics of the conversation. The experimental results on a dataset from a Chinese SNS show that our approach is promising for modeling conversations composed of short texts with informal styles and has the potential to be used to build context-aware chatting systems. ﹀
分类号：	TP3
论文总页数：	63
参考文献总数：	31
馆藏号：	017/M2016(857)
公开日期：	2019-05-24

基于语料库的国际汉语学生中文辅助写作系统.胡亮

链接

题名：	基于语料库的国际汉语学生中文辅助写作系统
姓名：	胡亮
学号：	1301210662
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-24
关键词：	语料库；辅助写作系统；词向量；主题模型
论文摘要：	︿随着中国国际影响力的日益提高，越来越多的母语为非汉语的国际学生开始学习中文。而在语言学习过程中，写作往往是其中的一个重要环节，但这对他们来说通常构成了一个具有挑战性的任务。针对这样一个群体，本文提出了一个基于语料库的中文辅助写作系统，在充分挖掘地道的中文写作语料信息的基础上给学生们提供写作参考和辅助。在前人的辅助写作系统研究中，人们往往将关注点放在词汇和语言级别，而对内容级别的挖掘以及文章主题关注较少，并不能有效解决学生的写作素材缺乏和主题思路缺乏等问题。另外，以往的辅助写作系统往往针对母语为非英语的英语学习者，而很少关注中文写作尤其是国际学生的中文写作问题。为此，本文系统分析前人的研究成果，总结其优缺点，设计并实现了基于语料库的国际汉语学生中文辅助写作系统。首先，本文实现了一个阶段式的爬虫框架，获取大量真实可靠的作文语料，然后按照系统设计需求从词级、句子级和主题思路级对作文语料进行了挖掘和分析。在词级语料处理中，我们统计分析了词频、词性信息，训练了词向量模型。特别地，本文提出了一个基于 HSK 词表的扩展词汇等级计算方法和基于 i+1 理论的词语扩展方法。在句子级语料处理中，在句法分析的基础上获取了句子结构主干，在词向量的基础上计算了句子向量。在主题思路级别，使用了无监督的 LDA 主题模型对作文进行了主题建模，并计算了主题向量。然后，探讨了系统核心模块的关键技术，并依此设计和实现了词、句子和主题思路三个级别的扩展模块。在词扩展模块，采用先扩展后约束的方法。首先采用基于语义词典和词向量模型两种方式获取候选扩展词语的全集，然后从词频、词性、上下文语境和学生词汇水平等不同维度对候选词语进行综合排序。在句子扩展模块，采用从结构和内容两个层面进行相似度扩展，提出了基于依存结构的句子结构相似度和基于词向量的句子内容相似度计算方法。在主题思路扩展模块，借鉴句子向量的方式对主题向量计算作出了探索性尝试，取得了较好的效果。最后，设计并实现了一个中文辅助写作系统，并对系统进行了评价。该系统是一个使用 Flask 框架搭建的 Web 应用，包含了词、句子和主题思路扩展三个模块的具体实现过程。最后，我们邀请志愿者对系统进行了测试使用，并对各个功能模块以及系统整体性能进行打分。评价结果显示，系统扩展模块效果良好，整体性能对学生写作水平的提升也有积极的作用，符合预期的效果。﹀
分类号：	TP311.52
论文总页数：	56
参考文献总数：	0
馆藏号：	017/M2016(1025)
公开日期：	2016-05-24

2016-05-23

成分分析和函数主目分析：分歧与融合.刘骁萱

链接

题名：	成分分析和函数主目分析：分歧与融合
姓名：	刘骁萱
学号：	1301213010
专业：	外国语言学及应用语言学
公开时间：	公开
培养层次：	硕士
学位：	文学硕士
培养单位：	北京大学
院系：	外国语学院
导师1姓名：	何卫
导师1单位：	外国语学院
论文答辩日期：	2016-05-23
外文题名：	Constituent Analysis and Function-Argument Analysis: Divergence and Convergence
关键词：	成分分析；函数-主目分析；范畴语法；生成句法
外文关键词：	Constituent Analysis; Function-Argument Analysis; Categorial Grammar; Generative Syntax
论文摘要：	︿成分分析和函数主目分析是两种相去甚远的句法分析方法。成分分析的来源可追溯至索绪尔对语言符号的定义和语言系统的研究对象。布隆菲尔德在索绪尔思想基础上发展了直接成分分析法，之后乔姆斯基建立了转换生成语法的最初模型、短语结构语法。函数主目分析方法可追溯至弗雷格，他将数学中的函数思想运用到命题语言分析中，加之罗素建立的类型论，以函数主目运算为操作的范畴语法得以发展。这两种分析方法乃至其各自代表的思想在很多句法理论中共存，偏重各异，如直接成分分析法很大程度上是成分分析，而其二分思想可以说与函数分析不谋而合；范畴语法以函数-主目分析为基础进行句法运算，同时也沿用了传统意义上和成分分析中的句法范畴。本文在梳理成分分析和函数主目分析两种方法的开端和发展的基础上，探究分别以二者为分析方法的有代表性的句法理论，直接成分分析法和范畴语法，并研究在生成句法理论中分析方法的转变，主张函数-主目分析和组合原则能补充发展最简方案对语义的解释，显示出两种研究方法的融合。﹀
外文摘要：	︿ Among theories of syntactic analysis, Constituent Analysis and Function-Argument Analysis appear to be two fundamentally different approaches to syntax of natural language. Respectively typical theories are Immediate Constituent Analysis and Phrase Structure Grammars on the one hand and Categorial Grammars on the other. This thesis attempted to trace the origin and essence of Constituent Analysis and Function-Argument Analysis to Saussure and Frege, to examine the development of Constituent Analysis focusing on Bloomfield’s Immediate Constituent Analysis and Chomsky’s Phrase Structure Grammar, and then to survey that of Function-Argument Analysis via Categorial Grammars, with the aim of identifying their divergence. Through a review of the change in analytic method toward the X-bar Theory and the Minimalist Program, this thesis proposes that Function-Argument analysis supplement and promote the semantic interpretation of the Minimalist Program, and that the cooperation and convergence of the two approaches is attainable. ﹀
分类号：	H0-06
论文总页数：	72
参考文献总数：	0
馆藏号：	039/M2016(18)
公开日期：	2016-05-23

2016-05-22

L市公安局民警绩效考评提升的研究与应用.席海龙

链接

题名：	L市公安局民警绩效考评提升的研究与应用
姓名：	席海龙
学号：	1301221700
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2016-05-22
关键词：	公安民警；公安绩效考核；360度考核法
论文摘要：	︿随着科学技术的发展，时代的进步，公安部门现有的绩效考核机制已经不能满足全面考核民警综合素质的功能需求，尤其是“十八届五中全会”后，公安机关绩效考核信息化的应用得到了公安部和相关专家学者的重视。因此如何建立起统一、高效、全面的绩效考核管理信息系统，提高公安民警的工作效率以及为领导进行人员分配提供有力依据显得尤为重要。我国现阶段的公安民警考核方式还很单一，如只依据月底、年终的拘留数或破案数量等统计或单纯将领导的认同作为绩效考核的得分，将大大影响考核结果与实际情况的真实性和有效性；其次，现有的公安考核系统种类繁多，且都较为分散，缺乏一个统一的标准。从而导致现有的考核系统不能有效地反映出每名民警的自身特点，也就不能及时发现每位民警自身需要改善的问题，为领导选人用人提供良好依据。L市的公安考核系统是我国现有公安考核系统的一个缩影，所以本文针对L市的公安考核系统存在的漏洞进行如下分析，探讨并创建出了一套综合民警绩效考核系统。该绩效考核系统本着以公安民警为中心，以方便实际操作、提高警务效率为目的，严格按照面向对象的个人能力进行需求分析、系统设计。若将该系统用于公安管理工作中，就能更加全面地了解警员信息，激发民警投身公安工作的积极性，大大提高公安绩效考核管理效能。本文在了解绩效考核理论、中西方公务员绩效考核发展、警察绩效考核理论基础上，全面分析了当前公安绩效考核现状，存在的问题，对公安工作的影响，以及产生问题原因。通过对5种绩效考核方法优缺点的比较，结合L市公安民警绩效考核中的诸多问题，提出了以360度考核法为框架的新的公安绩效考核优化体系，以实现对L市公安局民警绩效考核管理系统的优化，进而对民警进行更全面深入的研究分析，并且通过对民警客观能力的评定，为民警个人改善问题提供出路，为领导落实岗位职责提供有力依据。﹀
分类号：	TP399
论文总页数：	62
参考文献总数：	0
馆藏号：	017/M2016(141)
公开日期：	2016-05-22

警用标准地址数据管理和服务系统的分析与设计.王一骄

链接

题名：	警用标准地址数据管理和服务系统的分析与设计
姓名：	王一骄
学号：	1301221678
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	信息科学技术学院
论文答辩日期：	2016-05-22
关键词：	地址编码；分词算法；特征词抽取；地址模型；标准地址管理系统
论文摘要：	︿地址是公安信息“人、地、物、组织、事件”五元素之一，在公安工作中发挥着基础作用；地址在社会管理、智慧城市建设、公众信息服务等领域均具有广泛的用途，是现代社会所必需的战略性基础信息资源。通过标准地址模型建设，使地址标准化，成为业务上图和业务关联的桥梁，是一项具有重大意义的基础工程。警用标准地址数据管理和服务系统是在全国统一的技术规范指导下，基于地址空间数据库，为全国公安部门提供地址服务的重要公安信息化应用系统。本文的主要工作和取得的成果如下： 1.提出基于关联规则自适应地址模型的构建方法，完善地址要素分类，解决了地名描述要素与结构的规范化表达问题，并进行了实验验证。 2.针对当前特征词提取的不足提出了新建特征词库，并采用了基于统计的分词算法，并对其进行了测试分析。 3.针对地址表述的多样性和空间唯一性特点提出空间编码和属性编码的双重地址编码，将地址编码统一，并对此编码进行测试。 4.构建了标准地址数据模型，借助空间编码实现了地址数据和历史数据的关联，在此基础上对系统进行需求分析，并提出系统的设计方案，完成系统的详细设计，系统功能包括地址采集、地址逆向采集、地址更新以及地址匹配等功能。 5.实现了设计的系统，通过引入成熟的设计标准实现了所需要的研究成果。并进行了系统测试与维护。﹀
分类号：	TP
论文总页数：	56
参考文献总数：	30
馆藏号：	017/M2016(829)
公开日期：	2016-05-22

2015-06-10

俄语汉译与俄语英译中文化补偿策略对比研究.张欢

链接

题名：	俄语汉译与俄语英译中文化补偿策略对比研究
姓名：	张欢
学号：	1201210979
专业：	软件工程（二级学科名称）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-06-10
外文题名：	A contrastive study of cultural compensation strategy in translating Russian into Chinese and English
关键词：	文化缺省；功能对等；俄汉翻译；俄英翻译；补偿
外文关键词：	Cultural Default; Functional Equivalence; Russian-Chinese translation; Russian-English translation; Compensation
论文摘要：	︿由于各国文化的特殊性，文学作品的原作内容中往往蕴含着独特的文化现象。各国、各民族独特的文化内涵，给翻译过程带来了极大的困难。在翻译活动中，翻译缺失(переводческая потеря)是无法避免的。在著名的“空缺理论”中，俄罗斯学者Ю.А.Сорокин和И.Ю.Марковина提出了两种基本方法：填空法和补偿法。补偿法无疑是消除文化缺省现象最有效的方法之一。国内学者还提出了一系列方法，如：音译法、注释法、语义再生法、借用法、代换法等不同方法。通过对比分析国内外学者对文化空缺与补偿策略的研究可见，国内关于翻译补偿的研究直至2006年以前一直处于零散的理论点讨论阶段，没有完整的理论体系的构建。夏廷德在《翻译补偿研究》一书中首次对翻译补偿做出全面而系统的理论体系构建，建立了以“八大分类、六项原则和两个层面”为主要框架的理论体系。但是国内学者关于翻译补偿的研究都是针对英汉翻译为主，而对俄汉翻译的补偿策略研究则基本空白。因此，本文尝试以尤金·奈达提出的著名“动态对等”翻译理论为基础，提出从功能对等的四个层面分别探讨俄汉翻译的补偿策略假设，形成完整的以词汇对等、句法对等、篇章对等和文体对等层面的俄汉翻译补偿策略体系探讨，并初次尝试将俄英翻译、俄汉翻译中的补偿策略进行同步对比分析，试图把英文译本的文化补偿策略中值得借鉴之处，吸收并应用到俄汉翻译中来，从而实现汉语译本对原文的忠实性再现。首先，在词汇对等层面，本论文归纳出三类文化缺省现象，并提出：采用“功能替换/属性归纳”的补偿策略，可以避免文化负载词的翻译缺省现象；采用“音译，加词汇标注”的补偿策略，可以避免专有名词的缺省现象；采用“译出缩略语所指代的完整单词在目标语中对应的翻译”的补偿策略，可以避免缩略语的缺省现象。其次，在句法对等层面，本论文就两类文化缺省现象提出补偿策略：采用“补充句子成分/句子重构”的补偿策略，可以避免俄语汉译后句子成分残缺，以致语义不完整的现象；采用“替换法/转换法”的补偿策略，可以避免俄语原句中使用比喻、拟人、反问等修辞手法所导致的翻译缺省现象。再次，在篇章对等层面，本论文提出采用“归化的翻译策略”，增/删/改特定的句子成分，可避免该层面的文化缺省现象，从而实现源语言与目标语在语篇层面的动态对等。最后，在文体对等层面，提出采用“意译的翻译策略”，可避免对异文化中隐藏的特殊感情色彩的翻译缺失，从而实现文体风格的一致。本论文的研究对象是俄语原著Один День из Жизни Ивана Денисовича，作者是索尔仁尼琴（1918-2008），俄罗斯作家，1970年获诺贝尔文学奖。代表作有中篇小说《伊凡·杰尼索维奇的一天》等。论文将对该著作的英文版译作One Day in the Life of Ivan Denisovich与汉语版译作进行对比分析，探讨在俄语英译、俄语汉译过程中，对文化缺省现象，应当如何采取不同的补偿策略。本论文的研究有助于读者更好地理解索尔仁尼琴的原著作品，通过吸收英译本中值得借鉴的文化补偿策略，提出对现有汉译本的改进建议，也为俄语的汉译探索一套新的翻译策略，从而为俄语英译、俄语汉译的研究提供参考价值。这将有利于译者在对异文化文本进行翻译时灵活运用各种消除空缺的方法，最大程度地为读者呈现源语言文化特点。本研究不仅对翻译有一定的理论和实践意义，对促进不同语言文化间的交流也会起到一定的启迪作用。﹀
外文摘要：	︿ Due to cultural particularities, original literary works often present unique cultural phenomena. The unique cultures of different states and nations bring great difficulties to the translation process. In translation activity, translation loss (переводческая потеря) cannot be avoided. In the famous "Gap Theory", the Russian scholars Ю.А.Сорокин and И.Ю.Марковина present two basic approaches: Blank Filling and Compensation. The compensation method is without doubt one of the most effective methods for bridging cultural gaps. Scholars in China have also put forward a series of methods, such as transliteration, the semantic method, the method of borrowing, the method of substitution and so on. A comparative analysis of studies on cultural gaps and compensation strategies by domestic and foreign scholars shows that domestic research on compensation remained at the stage of sporadic theoretical discussion before 2006, during which no complete theoretical system had been formed. The first to put forward a full and systematic theoretical system was Xia Tingde. In his book A Study on Compensation in Translation, he proposed a system which takes "eight categories, six principles and two dimensions" as its main framework. However, domestic research on translation compensation mainly concerns English-Chinese translation, and the study of compensation strategies in Russian-Chinese translation is basically blank. Therefore, on the basis of the famous “Dynamic Equivalence” translation theory proposed by Eugene Nida, this paper attempts to discuss compensation strategies in Russian-Chinese translation at four levels, Word Equivalence, Syntactical Equivalence, Chapter Equivalence and Stylistic Equivalence, and to propose a full system of translation compensation strategies for these four levels. This paper makes a first attempt to give a synchronous analysis of compensation strategies in Russian-English translation and Russian-Chinese translation, and proposes to draw lessons from compensation strategies in Russian-English translation so as to achieve faithful reproduction in the process of Russian-Chinese translation. At the level of Word Equivalence, this paper identifies three types of cultural default phenomena and proposes: for the cultural default of culturally loaded lexemes, translators should adopt the compensation strategy of "Functional Replacement/Attribute-Oriented Induction"; for the cultural default of proper nouns, translators should use the compensation method of "transliteration with annotation"; for the cultural default of abbreviations, translators should take the compensation strategy of "confirming what every letter in an abbreviation represents in the source language, and then translating its corresponding expression in the target language". At the level of Syntactical Equivalence, this paper discusses two aspects of cultural default and proposes: for Russian sentences that become incomplete in translation, translators should adopt the compensation strategy of "supplementing sentence elements/sentence reconstruction"; for the cultural default caused by sentences that contain figures of speech, translators should use the compensation method of “replacement/conversion”. At the level of Chapter Equivalence, this paper discusses “the domestication strategy” and advises translators to add, delete or change specific sentence elements in order to achieve dynamic equivalence between the source language and the target language at the discourse level. At the level of Stylistic Equivalence, this paper proposes: when translating special emotional meaning that is hidden in a foreign culture, such as disgrace, disgust, innocence, tragedy and so on, translators should select “the domestication strategy”, changing the way of expression but retaining the same emotion as in the original text so that the reader can easily understand, in order to achieve stylistic consistency and compensation. This dissertation discusses the Russian novella Один День из Жизни Ивана Денисовича, written by Solzhenitsyn (1918-2008), a Russian writer who received the 1970 Nobel Prize in Literature. The novella Один День из Жизни Ивана Денисовича is his magnum opus. This dissertation compares the use of compensation strategies in its English version One Day in the Life of Ivan Denisovich and its Chinese version《伊凡·杰尼索维奇的一天》, and discusses how to adopt different compensation strategies in Russian-English translation and Russian-Chinese translation in order to eliminate the phenomenon of cultural default. The research in this thesis will help readers gain a better understanding of Solzhenitsyn's original work, puts forward suggestions for improving the existing Chinese versions by absorbing compensation strategies from the English version, and explores a new set of translation strategies for Russian-Chinese translation, thereby providing a reference for research on Russian-English and Russian-Chinese translation. This will help translators of foreign-culture texts use a variety of methods to eliminate gaps and present the cultural characteristics of the source language to the reader as fully as possible. The research not only has theoretical and practical significance for translation research, but will also play an enlightening role in communication between different languages and cultures. ﹀
分类号：	H087/TP391
论文总页数：	82
参考文献总数：	44
馆藏号：	017/M2015(385)
公开日期：	2015-06-10

2015-06-01

技术文档英汉翻译中模糊语的处理方法研究—以《ViewletBuilder 版本7专业版用户手册》为例.李雨

链接

2015-05-31

市场营销中品牌名称的汉译策略——以《富人消费者：奢侈生活方式的营销与销售》为例.李小溪

链接

题名：	市场营销中品牌名称的汉译策略——以《富人消费者：奢侈生活方式的营销与销售》为例
姓名：	李小溪
学号：	1201210670
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-31
外文题名：	The Translation Strategy of Brand Name in Marketing---A Case Study of Translation of The Affluent Consumer: Marketing and Selling the Luxury Lifestyle
关键词：	品牌名称；市场营销；汉译策略
外文关键词：	brand name; marketing; translation
论文摘要：	︿本翻译的研究书籍题为《富人消费者：奢侈生活方式的营销与销售》，由美国学者罗纳德·D·米彻曼与爱德华·M·麦兹合著完成，2006年出版。著者基于多年教学经验及实际案例，全面解析了二十世纪末、二十一世纪初美国的奢侈品消费及营销策略。相信这将为近年来出现了新富阶级、正跻身世界奢侈品消费大国行列的中国提供一定的理论和实际参考。书中出现了大量的品牌名称，但是由于市场营销领域品牌名称汉译的研究译例重复较多，其汉译又较为复杂，因此成为了笔者翻译实践中所要重点考虑的问题。笔者通过学习市场营销领域的相关知识、比对同类译著、以及翻译《富人消费者》一书，针对市场营销中品牌名称的处理提出两大翻译原则：遵循我国的商业法律法规和恪守我国的道德规范。在上述原则的指导下，笔者将品牌名称分为已有官方译名的品牌名称以及尚无官方译名的品牌名称，基于前人的研究与实际译例，总结出已有官方译名的品牌名称的翻译策略，即音译法、直译法、意译法、音意结合法与零翻译。随后针对尚无官方译名的品牌名称仿照以上翻译策略，另外参考已有的流行译名，给出自己的翻译。并通过实例对策略进行阐述，说明策略的有效性。最后，本文对书中的品牌名称进行归类制表，以期为同类书籍的汉译提供参考。笔者旨在深入挖掘此类书籍中品牌名称汉译的策略，填补此领域汉译研究中的缺漏，凸显品牌名称背后的文化与内涵，增加译著的趣味性。﹀
外文摘要：	︿ This research is based on the translation of The Affluent Consumer: Marketing and Selling the Luxury Lifestyle, co-authored by Ronald D. Michman and Edward M. Mazze and published in 2006. Based on years of teaching experience and numerous case studies, the authors analyze the affluent market in the United States at the turn of the 21st century and its marketing strategies. It reveals the lifestyles and values of the affluent Americans and may shed light on the critique of the Chinese nouveaux riches. Since the translation of brand names is complicated and scantily studied, the multitude of brand names in this book becomes one of the key issues that need to be addressed. Through a study of marketing, a comparison of similar translation projects, and the translation of The Affluent Consumer, this research first proposes the codes and principles of brand name translation. Next, it divides brand names into two categories: brand names with and without official Chinese translations. It goes on to put forward by examples specific translation strategies, such as transliteration, literal translation, free translation, combination and zero translation. It ends with a list of all the brand names in the book according to the above-mentioned categories with detailed annotations. This research is aimed at a deeper study into the translation strategy of brand names in marketing, thus filling the gap in this field. It also reveals the cultural connotations of the brand names while trying to add more delight to the reading experience of the translated version. ﹀
分类号：	H087/TP391
论文总页数：	189
参考文献总数：	38
馆藏号：	017/M2015(513)
公开日期：	2015-05-31

文本类型视角下社会学专著的翻译研究——以Pricing Beauty: The Making of a Fashion Model为例.范冬妮

链接

题名：	文本类型视角下社会学专著的翻译研究——以Pricing Beauty: The Making of a Fashion Model为例
姓名：	范冬妮
学号：	1201210557
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-31
关键词：	文本类型理论；纽马克；社会学专著；翻译研究
论文摘要：	︿时装模特行业是时尚产业的重要支柱之一，璀璨舞台的帷幕之后，还隐藏着很多社会学理论知识。本文的翻译对象Pricing Beauty: The Making of a Fashion Model一书，即以社会学家的视角来看待时装模特的职业生涯。作者阿什莉·米尔斯（Ashley Mears）曾是一线名模，以模特和社会学研究生的双重身份参与观察行业内幕，并将自己的学术发现和研究成果写成了这本博士论文。除专业读者之外，也存在仅因兴趣而选择阅读学术专著的普通读者。对于既面向专业人士又能吸引大众读者的人文社科学术专著，应如何翻译才能兼顾这两类受众群体？该翻译研究问题可以在纽马克（Peter Newmark）的文本类型理论的指导下为文本分类，同时考虑英汉词句差异、读者层次、作者意图和源语言文本目的等诸多因素。纽马克按照语言功能将文本分为表达型文本（expressive text）、信息型文本（informative text）和呼唤型文本（vocative text）。而社会学专著通常会交替出现学术分析性内容和叙事描述性内容。根据文本类型定义，可认为前者属于信息型文本，后者属于表达型文本，再据此选择语义翻译或交际翻译作为翻译方法，同时服务于两种读者群体的阅读需求。本文从术语和用词、被动语态、插入语这三方面入手，探索在纽马克文本类型理论指导下社会学学术专著的翻译策略和技巧，试图解决不同文本类型的兼顾问题。力求做到专业术语准确，用词符合表达型文本语言特点；合理使用语态；平衡译文的流畅和断句，并注意维护口语文本的真实性。通过研究和实例分析表明翻译策略的有效性，期望本文能对文本类型理论在学术翻译中的应用有所借鉴。﹀
分类号：	H087/TP391
论文总页数：	182
参考文献总数：	0
馆藏号：	017/M2015(665)
公开日期：	2015-05-31

基于语篇特征的句子仿拟英汉翻译研究.何令琪

链接

题名：	基于语篇特征的句子仿拟英汉翻译研究
姓名：	何令琪
学号：	1201210594
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-31
外文题名：	An Exploration on the Translation of Parody in Sentences Based on Textual Features
关键词：	仿拟翻译；句子仿拟；语境学；语篇特征；翻译策略
外文关键词：	parody translation; parody in sentences; context; textual feature; translation strategy
论文摘要：	︿仿拟辞格是中英共有的常见修辞手法的一种。由于仿拟形式、类别、长度、位置以及所处文本类型千差万别，因此在翻译实践的过程中，各类仿拟的翻译策略不能一概而论。传统对仿拟翻译的研究主要集中在对新闻标题以及广告语等短语的翻译上，对句子中的仿拟翻译的研究则鲜有人涉足。句子仿拟翻译不同于标题或广告语等短语翻译，需要与上下文联系呼应，与整个语篇形成一个有机整体。作为整个语篇的一个组成部分，句子中的仿拟存在怎样的特殊性，在文中具体的语义如何，怎样翻译合适也需要与语篇互动之后得到结论。本文以语境学为基础分析句子仿拟的特殊性，并创造性地从语篇特征的角度来研究两种不同语言之间句子仿拟翻译的相关问题。本文以《时尚品牌营销：从阿玛尼到飒拉》(Fashion Brands: Branding Style from Armani to Zara)一书的翻译实践为例，对句子中的仿拟翻译策略进行分析，通过对句子仿拟特殊性的研究，探究了传统仿拟翻译方法平移到句子仿拟翻译中的薄弱之处并给出自己的结论。首先，本文总结了仿拟翻译的现有研究状况，针对语境理论对句子中仿拟辞格区别于一般短语仿拟的特殊点进行了分析。经研究发现，句子中的仿拟辞格主要有以下四个特点：其一，主客体文化补偿要求高；其二，追求连贯通顺甚于求新求变；其三，相应对仿体的限制相对较小；其四，多出现衍伸意义，需借用语境确定语义。从而，根据以上特点进一步指出将广告或标题类的短句仿拟翻译方法平移到书籍中的句子仿拟翻译时产生的四种问题：文化补偿手段缺失、阅读流畅性受损、过分直译导致损失原文的本意及一词多义时词义选定困难的问题。然后，根据语篇语言学提出的语篇特征，结合翻译实践过程中积累的经验，提出了句子中仿拟翻译问题可以从可接受性与互文性、语篇衔接与连贯性、意图性和语境性四组角度分析，相应地采取文内外加注翻译策略、直译加注释策略、套译加增译的翻译策略。最后，提出句子仿拟相较传统仿拟翻译策略的改进之法，即“直译+注释”的翻译方法，既照顾到读者对文章文化补偿和流畅度方面的诉求，同时满足了译文须贴近原文、表意明晰的要求。﹀
外文摘要：	︿ Parody, a common figure of speech, has a long history. Because parodies vary in form, category, length and position within sentences, translation strategies for parody must differ from case to case. However, traditional studies of parody translation have mainly focused on parody in news headlines and advertisements, leaving the translation of parody in other areas under-studied. Unlike parodies in news titles and commercials, a parody within a sentence must be translated so that it conforms to the context and fits the text as a whole. As a component of the whole text, the meaning of a parody in a sentence and its textual features can only be determined from the context. In this paper, the author explores the characteristics of parody in sentences based on context analysis and studies sentence parody translation between Chinese and English from the perspective of discourse analysis. Based on the translation practice of Fashion Brands: Branding Style from Armani to Zara, the paper analyzes the uniqueness of sentence parody under context theory. Firstly, cultural compensation is strongly needed to understand it. Secondly, being clear and coherent is more important than being novel when it comes to translation. Thirdly, the form of translation is comparatively free. Fourthly, the context is needed to determine the semantics of parodies with double meanings. Based on the above, the paper further points out several problems that arise when traditional translation strategies are carried over to sentence parody translation: the lack of cultural compensation methods, damage to reading fluency, deviation from the original intention and polysemy confusion. Finally, the paper puts forward sentence parody translation strategies under different conditions from four sets of textual features, namely acceptability and intertextuality, cohesion and coherence, intentionality, and contextuality, and concludes that literal translation with notes is the best strategy. ﹀
分类号：	H087/TP391
论文总页数：	179
参考文献总数：	19
馆藏号：	017/M2015(764)
公开日期：	2015-05-31

英汉翻译时名词化的翻译策略 -以社科文本《欧洲经济史》为例.蒋纬

链接

题名：	英汉翻译时名词化的翻译策略 -以社科文本《欧洲经济史》为例
姓名：	蒋纬
学号：	1101210700
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-31
外文题名：	Translation Strategies of English Nominalization--On the Practice of Social Science book An Economic History of Europe
关键词：	社科类；功能对等；名词化；翻译策略
外文关键词：	Social Science; Functional Equivalence; Nominalization; Translation Strategies
论文摘要：	︿互联网时代，各国之间的经济文化交往空前广泛。由于各国地理环境、历史发展、社会组织等不同，社科类书籍作为了解世界的一种渠道，越来越受到人们的欢迎。与此同时，社科类书籍大量引进中国，社科书籍的翻译需求不断增加。社科类书籍大量使用名词化形式，这一现象已受到很多研究者的关注。本文从功能对等角度探讨社科英语中名词化的翻译方法。笔者在知网以“功能对等+名词化翻译”为关键词搜索时，发现只有 4 篇论文讨论这个主题，其中 3 篇都是探索科技英语名词化翻译，另外一篇讨论-ing 形式名词化翻译。还没有人从奈达功能对等角度讨论社科文本中的名词化翻译。本文以奈达的功能对等理论为指导，讨论翻译书籍《欧洲经济史》的名词化翻译，首先回顾了学者对名词化的研究，并且回顾了前人对名词化翻译策略的研究。然后借助奈达的功能对等理论的主要观点，指导名词化翻译。接着，在笔者翻译实践的基础上，就《欧洲经济史》中的名词化进行分析，并讨论不同的名词化形式，调查了名词化翻译过程中的翻译问题。最后提出实现名词化功能对等的六种翻译方法。﹀
外文摘要：	︿ In the Internet era, economic and cultural exchanges between countries are constantly growing and have reached an unprecedented level. Due to differences in geographical environment, historical development and social organization, social science books have become a channel for people to understand the world and are increasingly popular. In the meantime, social science books are being introduced to China in large numbers, and the demand for translation of these books is also on the increase. The extensive use of nominalization in social science books has attracted the attention of researchers. In this paper, the author explores methods for translating nominalization in social science English texts from the perspective of functional equivalence. The author searched CNKI for the keywords "functional equivalence + nominalization translation" and found only four related papers. Among them, three papers explore the nominalization translation of scientific English, while the remaining one discusses the translation of a specific kind of nominalization, namely "-ing" forms. No one has probed into the nominalization translation of social science texts from the perspective of Nida's functional equivalence theory. In this paper, with Nida's functional equivalence theory as guidance, the author discusses nominalization translation in the book An Economic History of Europe. First of all, the existing studies on nominalization and its translation strategies are reviewed. Then, Nida's functional equivalence theory is adopted to guide the nominalization translation process. After that, based on translation practice, the author presents a detailed analysis of nominalization in the excerpted text from An Economic History of Europe, discusses different forms of nominalization and investigates common issues in translating them. Finally, six translation methods for achieving functional equivalence in nominalization translation are proposed. ﹀
分类号：	H087/TP391
论文总页数：	180
参考文献总数：	0
馆藏号：	017/M2015(856)
公开日期：	2015-05-31

针对汉学传统服饰类文本的术语回译研究.李文丽

链接

题名：	针对汉学传统服饰类文本的术语回译研究
姓名：	李文丽
学号：	1201210667
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-31
外文题名：	Terminology Back-translation Study in Sinological Works on Traditional Chinese Clothing
关键词：	汉学研究；传统服饰术语；术语翻译；回译策略
外文关键词：	Sinology study; Traditional Chinese clothing terminology; Terminology translation; Back-translation strategies
论文摘要：	︿本文研究译本《清朝至今的中国服饰》(Chinese Dress: From the Qing Dynasty to the Present)是由长期旅居中国的英国服装设计师和中国传统服饰研究学者瓦雷莉·加勒特所著，并于2008年出版问世。作者以时间为线索，讲述了从清朝到当代中国服饰的演变，其中还穿插服饰相关的民俗介绍。由于中国传统服饰文化的独特性以及中西语言差异，书中大量的传统服饰术语，给翻译造成了难点。本文总结了源文本在服饰术语英译过程中出现的三个方面的问题：术语处理方式混乱、模糊处理和理解错误。结合回译过程，笔者探究了造成这些问题的三个因素：服饰术语自身复杂性、文化可译性限制以及参考资料缺乏，并对比分析了不同类型文本对服饰术语的处理和翻译方式。最后在简洁性和一致性原则指导下，笔者结合具体翻译实例提出“直译+文内注释”，“查证和推断”，“统一术语一致性”的服饰术语回译策略。并在最后附上传统服饰术语对照表，列举原文服饰术语及对应的汉语概念，以期让同类研究者更能从中发现服饰术语翻译的难点和问题。鉴于传统服饰术语翻译的诸多难点以及复杂性，对于文学文本与非文学服饰专著中的术语表达或翻译方式，笔者并未进行优劣评判。仅对其中的服饰术语进行了比较分析，以期为服饰术语翻译研究者、术语制定者或相关领域的研究者提供些许借鉴。﹀
外文摘要：	︿ This research is based on the translation of Chinese Dress: From the Qing Dynasty to the Present, published in 2008 and authored by Valery Garrett, a British fashion designer and scholar who stayed in China for over 20 years. It illustrates chronologically the evolution of traditional Chinese clothing and clothing-related customs. As it contains a large number of clothing terms carrying unique Chinese culture, this book poses great challenges to the translator who is not necessarily an expert in the field. This thesis first summarizes the problems and mistakes Garrett makes about clothing terminology, and looks into and analyzes three factors that cause these problems, i.e. complexity of traditional Chinese clothing terms, limitation of translatability due to cultural differences and lack of reference materials. By comparing with clothing terms in literary translations such as The Dream of the Red Mansion, this thesis further explains the differences and complexities in translating these terms. It also puts forward three back-translation strategies, namely literal translation plus cut-in notes, verification and deduction, and consistency. A table of terminology is attached in the appendix for reference. Given the complexity of clothing terminology, this dissertation aims not to pass judgement on the translation of different genres, but to provide a meaningful perspective for term compilers and other researchers in this field. ﹀
分类号：	H315.9
论文总页数：	260
参考文献总数：	33
馆藏号：	017/M2015(399)
公开日期：	2015-05-31

操纵视角下中国政治文献法译研究——以《习近平谈治国理政》为例.杨慕

链接

题名：	操纵视角下中国政治文献法译研究——以《习近平谈治国理政》为例
姓名：	杨慕
学号：	1301211045
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-31
关键词：	操纵理论；《习近平谈治国理政》；政治文献翻译
论文摘要：	︿当代翻译理论已经突破了在传统理论框架下的源语言以及目标语言在语言层面的转换。操纵学派认为翻译是一种社会文化行为，该学派的主要代表人物之一Lefevere提出了翻译操纵理论。他认为，在整个翻译过程中，有三大要素起着至关重要的操纵作用，具体来说即为意识形态、诗学以及赞助商。现如今，我国的翻译研究进入了百花齐放的阶段，但这些研究多是针对文学翻译，关于政治文献翻译的研究数量并不多。且已有的研究较少从社会文化的角度出发，大多是关注于语言层面的转换和对等。政治文献的翻译是其他国家了解我国的重要资源，因此，政治文献翻译的研究有着不容小视的意义。本文主要采用描述性研究法，根据Lefevere的操纵理论对《习近平谈治国理政》中文版及法文版进行分析，探寻意识形态、诗学以及赞助商这三要素是如何操纵这一典型政治文献的汉法翻译全过程，并且在研究过程中结合政治文献翻译的特点，发现该理论的局限之处，同时积极探索其它操纵因素所发挥的作用，尝试对该理论实行创新和完善。本文的研究结果表明：（1）宏观上讲，Lefevere操纵理论虽基于比较文学且已有的研究也均应用于文学翻译，但本文通过分析研究政治文献翻译这一游走于文学与权力之间的非文学翻译，证明意识形态、诗学和赞助商对其同样起到了操纵作用，故Lefevere的操纵理论也同样适用。（2）微观上说，通过分析政治文献翻译中这三大要素的操纵痕迹，可以了解政治文献翻译的全过程，以及异化翻译法作为政治文献翻译的主要翻译策略。（3）应用到对政治文献翻译的分析时，Lefevere的操纵理论也存在局限之处。该理论研究的是目标语境中三大操纵要素的负面影响，如意识形态和赞助商可导致译文不忠实于原文，并使得译文向目标语文化中的意识形态靠近。而本文的研究则证明，在政治文献翻译过程的源语语境中，操纵具有积极意义，强大的意识形态和赞助商使得译文尽可能地忠实于原文。（4）综合来讲，本论文除了研究Lefevere提出的三要素之外，还研究了译者个人的操纵这一因素，比如其主观能动性，以及译者隐性和显性的辩证统一，从而发现除意识形态、诗学和赞助商这三大操纵因素之外，译者的创造性操纵也是影响翻译过程的重要因素，译者个人对政治意识形态的认同及其翻译能力是导致译文水准以及是否忠实于原文的决定性因素。﹀
分类号：	H087/TP391
论文总页数：	75
参考文献总数：	0
馆藏号：	017/M2015(409)
公开日期：	2015-05-31

从图文关系看纪录片解说词的翻译——以《自然世界》翻译为例.姚传云

链接

题名：	从图文关系看纪录片解说词的翻译——以《自然世界》翻译为例
姓名：	姚传云
学号：	1201210942
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-31
关键词：	纪录片；解说词翻译；图文关系；画面和解说关系
论文摘要：	︿画面和解说是纪录片的两个重要组成部分，以往的解说翻译研究中，往往只关注解说词这一语言符号本身，对解说词和画面的关系关注较少。为了弥补前人研究中的这一缺陷，本文归纳了纪录片画面和解说的关系，并分析其对翻译的指导作用。希望引起翻译研究的学者对不同类型符号间相互关系的关注，也希望这篇报告能够给相关领域的学者或者纪录片乃至影视作品翻译的译者提供一个新的角度及一些启发。本文选择了 BBC 著名自然类系列纪录片《自然世界》 (Natural World) 中的 10 集进行翻译，主题上涵盖不同物种的动物；地理上覆盖全球各地；内容上广度深度各不相同；比较全面地反映了此类型纪录片的特点。本文首先整理了国内外字幕翻译的研究，指出现有研究的不足，即对语言符号和其他符号间关系关注不够，进而确定本文将从图文关系视角研究解说词翻译。然后，本文梳理了前人对图文关系的研究，明确各种图文关系研究方法及成果的优劣之处，明确本文将运用系统功能语法中小句关系来研究纪录片画面与解说的关系。之后，笔者分析了纪录片画面这种动态影像的特点、回顾了小句关系理论、分析了其对本文研究的适用性并总结了纪录片画面和解说的 9 种关系。最后，笔者按照画面和解说关系的类别，逐一分析了每种关系对译文的影响和对翻译实践的指导作用，并得出以下结论：画面和解说的关系可以指导纪录片解说词翻译实践，不同类型的关系对译文的要求不同，相对应的翻译方法和策略也不尽相同。说明关系要求译文明确二者共同传达的主题及侧重点所在；阐释关系要求译文明确第一小句的主题，同时使第二个小句能顺畅衔接；例证关系要求译文明确这种关系；延展关系中，无论是附加关系还是变化关系，均要求译文令画面和解说衔接顺畅；增强关系要求译文体现画面和解说的空间、时间、因果-条件和方式关系。详述关系可从明确画面和解说的主题以及二者的不同之处入手，延展关系可从第二个小句中的新信息入手；采用的翻译策略较为灵活，可采用改变将画面信息转成解说、省略画面已传达的信息、从反面表达等策略。本文中的示例均来源于笔者的翻译实践，可信度高，说服力较强。﹀
分类号：	H087/TP391
论文总页数：	295
参考文献总数：	36
馆藏号：	017/M2015(443)
公开日期：	2015-05-31

基于日语单语语料库的中日同形词翻译实务应用研究.李瑞鹏

链接

题名：	基于日语单语语料库的中日同形词翻译实务应用研究
姓名：	李瑞鹏
学号：	1201210659
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
论文答辩日期：	2015-05-31
关键词：	单语语料库；同形词；翻译实务
论文摘要：	︿汉语和日语中有许多形态相同的同形词，这些词汇使中日学生在学习对方国家语言时，貌似轻松了许多。不过，在漫长的岁月中，这些词语在两国语言中的意义和用法等发生了很大的变化，这给两国的语言学习者及翻译工作者造成了许多困扰。当前，中日同形词的误译情况十分普遍，这是汉日翻译工作中的一大难点。前人在中日同形词翻译问题的解决策略上，常常提出勤查词典、多积累等较简单建议。本文创造性地将日语单语语料库引入同形词翻译，从全新角度解决该问题。笔者就日语单语语料库对中日同形词翻译是否有用与怎么用的问题，进行了深入探讨和研究。首先从理论上，探讨了日语单语语料库应用于中日同形词翻译的可行性。其次将其与其他电子翻译查询工具进行对比，证明将日语单语语料库应用于翻译实务不仅是可行的，而且是必要的。本文将中日同形词的误用、误译情况进行了系统的梳理和具体分类。通过对比、举例论证、量化分析等方法，详细探讨了在翻译实践中，日语单语语料库在解决中日同形词搭配等方面的作用。并使用日本最权威的现代日语书面语语料库和日本女性杂志等语料库，通过NLB、中纳言和「ひまわり」等语料库检索工具，对本文整理出来的同形词翻译问题进行了详细研究和逐一解答，并通过具体实例进行了探讨。研究结果表明，无论是日译汉还是汉译日，使用日语单语语料库都有助于解决中日同形词翻译的问题，提高翻译的质量，并且效果比较显著。﹀
分类号：	H087/TP391
论文总页数：	62
参考文献总数：	0
馆藏号：	017/M2015(462)
公开日期：	2015-05-31

饮食文化研究中特殊词汇的翻译策略—基于《食物发展史》翻译实践.夏锁

题名：	饮食文化研究中特殊词汇的翻译策略—基于《食物发展史》翻译实践
姓名：	夏锁
学号：	1201210878
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-31
外文题名：	The Translation Strategies for Translation of Food-related Words - Based on the Translation Practice of Food in World History
关键词：	饮食文化研究；特殊词汇翻译；翻译策略
外文关键词：	Food Studies; Translation of Food-related Words; Translation Strategies
论文摘要：	︿本文探讨内容所依据的译本《食物发展史》作者是杰弗里·皮尔彻，第一版于2005年出版。书中概括性地介绍了世界主要食物发展历程以及在此过程中，饮食与政治、经济和文化之间相互影响的关系，对不同历史阶段及不同国家或地区之间的饮食发展及其影响进行了对比，提出了很多原创性观点，既适合专业研究者作为学术参考，也可供普通读者阅读。书中用了大量与饮食有关的词汇，而由于中外在食物原材料、制作手段以及文化方面拥有较大差异，很多饮食词汇在汉语中没有约定俗成的对等译法，从而成为翻译此类书籍的难点所在。本文根据《食物发展史》一书的翻译实践，将饮食书籍中的特殊词汇分为三大类：食物名称、植物名称和文化负载词。根据这些特殊词汇的特点，笔者提出了翻译需要遵循的三大原则：经济性原则、主要特征原则和受众为主导原则，并在此三大原则的基础上，给出了相应的翻译策略：针对食物名称，运用汉语造词法进行创造性翻译，具体包括利用典型特征和类属词、音译+注释、灵活运用上位词和下位词；针对植物名称，需要尊重词汇的时代性、从上下文语境出发、进行推敲和验证；针对文化负载词，进行回译、发挥译者主动性。在探讨中，结合具体实例对这些策略进行分析。希望笔者的这些探究可以对其他此类书籍翻译的译者有所助益。由于饮食的研究离不开社会和文化，本文在对译例的探讨过程中加强了背景知识的介绍，旨在增强译者对文化和历史背景的了解，促进中西在饮食文化领域的学术交流。﹀
外文摘要：	︿ This paper is based on the translation practice of Food in World History, authored by Jeffrey Pilcher and published in 2005. It explores the relationship between food, politics, economy and human beings through the study of the expansion and development of food in different countries and areas. It also compares the culinary cultures of these countries and areas in different times. Numerous case studies are provided for illustration. In fact, the book presents the whole history of the world from the angle of cuisine. The ideas presented are intriguing and inspiring. It is a book suitable for professional researchers as well as common readers who are interested in culinary culture. This paper discusses three types of special words frequently encountered in such works of food studies: dish names, plant names and culture-loaded words. It then puts forward three principles of economical wording, of highlighting main features of the word to be translated and of evaluating reader’s demands. Besides, different translation strategies are analyzed with examples to illustrate the above three types of words: for dish names, borrowing the methods of word creation is recommended; for plant names, translating based on the historical features, the context and the connotation of the word is suggested; for culture-loaded words, using back translation and flexible translation is recommended. Since food studies are correlated with societies and cultures, this paper pays special attention to the background information in the analysis of these special words. Hopefully, the paper can help other translators to deal with the translation of such food-related words. ﹀
分类号：	H087/TP391
论文总页数：	163
参考文献总数：	46
馆藏号：	017/M2015(517)
公开日期：	2015-05-31

英语商业广告汉译策略分析 ——以《1852-1958年，百则最优广告》为例.由晓菲

题名：	英语商业广告汉译策略分析 ——以《1852-1958年，百则最优广告》为例
姓名：	由晓菲
学号：	1201210954
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
论文答辩日期：	2015-05-31
关键词：	商业广告意图广告语言翻译策略
论文摘要：	︿本翻译的研究译本题为《1852-1958年，百则最优广告——广告人和他们的那些事儿》，作者是美国广告研究者朱利安·莱维斯·沃特金斯。该书挑选了1852-1958年间颇有影响力的广告113则，阐述了广告的创作人、创作背景和广告产生的效果。本译本节选了其中具有代表性的一些广告。本文基于《百则最优广告》的翻译体验，首先指出商业广告是一种特殊的实用文体，包含明确的意图，即传达信息、促使受众做出相应的行动，取得广告的预期效果。同时，商业广告文本注重创意，以此打动消费者，语言运用灵活。其次，本文对中英文广告的语言分别从词法、句法、修辞、社会文化和媒介角度进行了对比分析。随后对书中涉及到的英语商业广告提出了两大翻译原则，即深入分析广告的意图和效果、重视中英文广告的语言特点。在上述原则的指导下，笔者对书中涉及到的广告进行分类归总，针对不同的广告语分别提出“语义翻译”、“传意翻译”、“再创翻译”和“增补翻译”四类翻译策略。本文旨在运用相关翻译策略使翻译后的中文广告能够取得和英文广告相同的广告意图和效果。﹀
分类号：	H087/TP391
论文总页数：	158
参考文献总数：	0
馆藏号：	017/M2015(541)
公开日期：	2015-05-31

海外华侨华人研究中术语翻译探究——以《新美国华侨华人社会：阶级、经济和社会等级》为例.谢润超

题名：	海外华侨华人研究中术语翻译探究——以《新美国华侨华人社会：阶级、经济和社会等级》为例
姓名：	谢润超
学号：	1201210887
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-31
外文题名：	Study on the Term Translation in Overseas Chinese Studies with the New Chinese America: Class, Economy, and Social Hierarchy as an Example
关键词：	海外华侨华人社会研究；术语；术语翻译；翻译策略
外文关键词：	Term Translation; Translation Strategy; Overseas Chinese Study
论文摘要：	︿本次翻译实践所选书籍为《新美国华侨华人社会：阶级、经济和社会等级》（The New Chinese America: Class, Economy, and Social Hierarchy, 2009）。本书由著名的美国亚裔文化知名研究专家赵小建主编，已于2010 年由美国新泽西路特格斯大学出版社出版。该书从一个全新的角度剖析了当代美国华侨华人社会的内部运作及其社会关系。该书对重新审视美国华侨华人社会，深化和丰富美国华人研究颇有启示。目前本书在国内尚无译本，本次翻译实践节选了本书前四章节。本书的最显著的特点之一是存在许多关于美国华侨华人的术语。本文基于翻译实践的体验，针对海外华侨华人研究领域术语的处理提出三条主要原则，即保持与作者的学术立场一致、紧密结合原文语境和适当信息补偿。在上述原则的指导下，笔者将术语分为规范性术语、非规范性术语。规范性术语指单义性属性较为突出的术语，包括专业的人名、地名、机构名称、历史事件名称等。非规范性术语可划分为模糊术语、近似术语与新兴术语。笔者进一步提出相应的翻译策略，即参照标准、比较与选择、注释法、连缀法与合理创造。并通过翻译实践的实例对策略进行阐述，证明策略的有效性。最后，本文对书中术语进行详细地梳理，予以归类，以期为同类书籍的翻译提供借鉴。鉴于翻译书籍的学术性质，本文在对翻译实例的分析过程中，尤其注意对该领域内专业知识的详细说明，旨在增强译者的专业背景知识，为加强海外华侨华人领域的中西学术交流做出贡献。﹀
外文摘要：	︿ This translation project is based on The New Chinese America: Class, Economy, and Social Hierarchy, authored by Zhao Xiaojian and published in 2010 by Rutgers University Press. The book analyzes the internal operation and social relations of contemporary Chinese America, with an emphasis on undocumented immigrants. During this project, the first four chapters were translated. Owing to the differences between Chinese and Western languages, cultures and values, and to disputes over term definitions, many terms cannot be directly translated into Chinese and therefore pose a great challenge to the translation of this book. This paper first proposes three principles for translating these terms: staying consistent with the author’s academic position, adhering closely to the context, and providing appropriate information compensation. Next, the thesis divides the terms into two categories: standard terms, including names of places, historic events, official documents, etc., and non-standard terms, including fuzzy terms, similar terms and new terms. It then puts forward specific translation strategies: conforming to the standard expression, comparing and choosing, annotation, clustered definitions and reasonable invention. Given the academic nature of the translated text, the thesis pays particular attention to supplying professional information about overseas Chinese studies in its analysis of terms, in order to enhance translators’ expertise. The original contribution of this paper is that it offers strategies for translating difficult terms and supplies elaborate term lists for both readers and translators of texts on overseas Chinese studies. ﹀
分类号：	H087
论文总页数：	170
参考文献总数：	0
馆藏号：	017/M2015(777)
公开日期：	2015-05-31

新媒体环境中新闻编译策略探究——以界面新闻《眨眼间》编译项目为例.王付娇

题名：	新媒体环境中新闻编译策略探究——以界面新闻《眨眼间》编译项目为例
姓名：	王付娇
学号：	1201210816
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-31
外文题名：	The Study on News Trans-editing Strategy in New Media Environment: Based on the Example of Blink Project of Jiemian News
关键词：	新媒体；编译；界面新闻；零度偏离
外文关键词：	New Media; Trans-editing; Jiemian Media; Zero-Deviation Theory
论文摘要：	︿新媒体是近年来新兴的一种媒体形式，国内外对其尚无准确定义。有人认为所有应用新兴互联网技术生产内容的媒介都叫新媒体，有人认为新媒体是与传统媒体对比的情况下得出的。不管怎样，新媒体给整个媒体行业带来的变革已经突显。作为互联网时代的新力量，新媒体平台更加多样化、扁平化，读者与媒体的互动更方便，媒体原先掌握话语权的垄断地位已不复存在。在这种时代背景下，由此衍生出的新媒体平台上的编译问题值得探究。很多新媒体仍然一味地照搬传统媒体的编译模式，无论从读者还是译者的角度，这显然已经不符合新媒体的特征。新媒体环境中产生了哪些新的语言特征？有哪些因素影响了新媒体编译？新闻编译该如何适应新媒体环境带来的新特征？本文重点结合语言学理论“零度偏离”，并使用了动态流通语料库（DCC）探讨了当下新媒体环境中出现的语言偏离现象。与此同时，分析了偏离现象产生的原因是受到用户心理、需求等各方面的改变，并以新媒体界面新闻为例，分析了新闻机构特点和编译流程对新闻编译产生的影响。在具体策略的问题上，本文选取了100篇新媒体环境中的编译新闻，包括新浪、网易、腾讯等，与笔者实习期间参与编译的新媒体“界面”《眨眼间》栏目的译本做对比，研究语料共计9万字。所有新闻语料均从2015年选取，具有很强的时效性。最后从语音、词汇、语法、修辞、语篇和副文本等维度探讨了提高新媒体编译正偏离具体可行的策略。最后一章采用调查问卷的验证方法对该策略进行了验证，真实了解新闻编译人员的工作现状，确保了提出策略的可靠性和实用性。﹀
外文摘要：	︿ As a newly created media form, new media has not yet been well defined. What is definite is that the appearance of new media has brought about great changes to the entire media industry. As a quickly expanding force in the Internet era, new media connects and interacts with readers more closely than old media and breaks the latter’s monopoly over discourse, giving more of a voice to readers, or rather users or consumers. Given this, the increasing problems of trans-editing on new media platforms need further research. Some so-called “new media” entities still use traditional trans-editing methods, copying them directly to their new platforms. This is obviously not suitable for the new media environment. What are the characteristics of new media’s language? What are the impacting factors? How should trans-editing adjust to new media’s development? This paper mainly draws on Zero-Deviation Theory to analyze the deviation phenomenon in new media language, using the Dynamic Circulating Corpus (DCC). It also explains this phenomenon by changes in users’ needs and psychology in the new media environment. Taking Jiemian News as a case, the paper further analyzes how the editing procedure and the characteristics of a news agency affect the trans-editing process. To answer these questions, this research selects 100 passages from reports by Sina, Netease, Tencent and Jiemian, where I did my internship, to analyze trans-editing problems in the new media environment. The materials total 90,000 words. The paper puts forward trans-editing strategies from the perspectives of pronunciation, words, rhetoric, discourse and para-texts. In the last chapter, a questionnaire is used to check the usability and reliability of the trans-editing strategies put forward in the paper. ﹀
分类号：	H087/TP391
论文总页数：	60
参考文献总数：	41
馆藏号：	017/M2015(786)
公开日期：	2015-05-31

汽车制造现场口译的难点分析及应对策略研究——以北京奔驰汽车有限公司喷漆车间为例.蒋博

题名：	汽车制造现场口译的难点分析及应对策略研究——以北京奔驰汽车有限公司喷漆车间为例
姓名：	蒋博
学号：	1201210625
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	张宏岩
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-31
外文题名：	On Interpretation Difficulties and Strategies in Automobile Manufacturing Shop Floor: A Case Study of Paint Shop in Beijing Benz Automotive Corporation
关键词：	汽车制造现场口译难点分析任务型学习
论文摘要：	︿随着中外合资汽车制造企业的蓬勃发展，汽车制造领域专业口译人员的需求缺口日益增大。近年来，虽然口译专业毕业的人才越来越多，但通过自身的实习经历发现，对于像汽车制造现场口译这样专业化程度极高的现场口译任务，译员一般很难在短时间内适应并完全胜任。同时，通过文献阅读发现，针对汽车制造这一具体领域的口译文献研究寥寥无几。而目前中外口译的研究趋势是由对口译结果和口译理论的研究向口译过程和口译实践的研究转变。因此，研究汽车制造现场口译具有现实和理论意义。文章基于调研对汽车制造现场口译的难点进行详细分析,根据难点分析结果并结合前人对任务型学习法在口译领域的应用研究成果，设计出一套完整且具体的汽车制造现场口译难点应对策略，为口译员提供帮助。文章首先阐述了行业研究背景和理论研究背景，研究内容和重点，研究价值，研究方法并概述了论文的结构。然后进行了文献综述，对口译、联络口译、口译学习以及口译质量评估的相关理论知识以及任务型学习法在口译领域的应用进行了研究总结。接下来根据对汽车制造现场口译员和译员使用者的访谈以及问卷调查从五个方面分析了汽车制造现场口译的难点。在难点分析的基础上，结合任务型学习法在口译领域的应用，设计出一套完整的任务型汽车制造现场口译学习法，并以北京奔驰汽车有限公司喷漆车间为例从任务前、任务中和任务后详细阐述学习法的具体实施要点。最后进行了总结和展望，总结了文章的重点内容，指出了不足之处，并对未来相关主题的研究做出了展望。﹀
分类号：	H087/TP391
论文总页数：	91
参考文献总数：	0
馆藏号：	017/M2015(569)
公开日期：	2015-05-31

移情视角下的非文学文本翻译研究——以主题为“Doing Business in China”的多著作翻译实践为例.姜楠

题名：	移情视角下的非文学文本翻译研究——以主题为"Doing Business in China"的多著作翻译实践为例
姓名：	姜楠
学号：	1201210622
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	张宏岩
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-31
外文题名：	Translation Study on Non-literary Text under Perspective of Empathy – Multi-Book Translation Practice on the Theme of “Doing Business in China”
关键词：	移情视角；非文学文本；翻译方法
外文关键词：	Empathy Perspective; Non-literary Text; Translation Method
论文摘要：	︿翻译报告的研究对象是主题为“Doing Business in China”的四本书目的核心章节，四本书的主要目标是向有意愿到中国经商的外国商人提供指南。这四本书目均涉及到对中国历史风俗和商业、政治环境等的描述，并对中西方思维及处事方式作了对比描述并给出相应建议。文章语言具有传递信息、提供指南的非文学文本的特性，整体而言语言平实、理性，但不同作者作为不同的情感个体难免流露出不同的感情倾向。在阅读和翻译过程中笔者发现作者因其各自不同的背景及经历而对中国的倾向性态度不一，或积极支持，或谨慎中立或稍有批判。为将原文含义能够原汁原味地传递给译语读者，翻译报告从移情角度的视角下进行研究，思考如何在“克服自我”和“自我发扬”阶段与原作者及译语读者进行准确移情。翻译报告中通过把握不同书目的情感倾向，从全书阅读及局部翻译的角度提出了考虑移情程度的因素，并给出了译例证明了所采用的翻译方法。对于无情感色彩的描述性原文，采用忠实原文的直译方法；对于具有情感色彩的原文，若直接直译并不影响译语读者主观能动的阅读体验，则采用直译方法，若直译时遇到情感倾向判断时，则以全书的情感倾向为指导进行移情翻译，从而实现忠实于原文的翻译；对于直译不能够准确表达译者移情感受的原文，则需要译者对原文做出解释或补充，采用意译的翻译方法。﹀
外文摘要：	︿ This is a report on the translation of the main chapters of four books on the theme of “Doing Business in China”. These books are manuals for people who want to do business in China. The texts tell readers about the differences that foreigners will encounter in China in terms of behavior, customs, the business and political environment, and ways of thinking. The authors of the four books have different backgrounds and experiences, and they are found to have different attitudes toward China. One loves China very much and encourages people to go to China, one is somewhat critical of China, and two are objective when describing China. These varied attitudes can be detected in the books despite their nature as non-literary texts. I must take the different attitudes into account during the process of translation. I analyze the attitudes of the four books from the perspectives of the whole and of the parts, and use translation examples to demonstrate, from the perspective of empathy, the impact these attitudes have on the translation of individual passages. For sentences that consist of objective information, I used literal translation; for those that express emotion and whose metaphors can be understood by Chinese readers, I also used literal translation; for those that express emotion but require a judgement of emotional orientation, I translated them in light of the attitude of the whole book; for sentences that need the translator’s explanation, I used free translation. ﹀
分类号：	H087/TP391
论文总页数：	282
参考文献总数：	15
馆藏号：	017/M2015(660)
公开日期：	2018-05-31

种族歧视词汇空缺的翻译策略研究——以《城中之城：密歇根州大急流城的黑人自由斗争》为例.雷阿芳

题名：	种族歧视词汇空缺的翻译策略研究——以《城中之城：密歇根州大急流城的黑人自由斗争》为例
姓名：	雷阿芳
学号：	1201210638
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	王雷
导师1单位：	外国语学院
论文答辩日期：	2015-05-31
外文题名：	Translation Strategies of Lexical Gaps on Racial Discrimination——Based on the Translation of A City Within a City: The Black Freedom Struggle in Grand Rapids, Michigan
关键词：	种族歧视；词汇空缺；翻译原则；翻译策略
外文关键词：	Racial Discrimination; Lexical Gaps; Translation Principles; Translation Strategies
论文摘要：	︿自建国以来，美国虽然长期存在着种族歧视，却在各方面的共同努力下基本实现了种族融合的目标。以人为鉴，可以明得失；以史为鉴，可以知兴替。因此，美国的种族关系史和种族政策也成为国内民族关系领域学者等学习探讨的对象。然而，由于中美两国种族文化方面的差异，种族歧视词汇空缺现象却成为这些学者了解美国种族关系的一大障碍。笔者基于《城中之城：密歇根州大急流城的黑人自由斗争》（A City Within a City: The Black Freedom Struggle in Grand Rapids, Michigan）的翻译实践为处理种族歧视词汇空缺现象提供一些思路。根据是否存在相关译法的原则，笔者将种族歧视词汇空缺分为已有约定俗成译法的词汇空缺、存在争议译法的词汇空缺和没有常见译法的词汇空缺三类。接着，笔者结合本次翻译实践，总结了种族歧视词汇空缺的翻译原则，即考究历史事实、深挖文化内涵，并以此为指导提出了参考现存译法策略、概念直引策略、语义阐释策略、合理加注策略以及灵活处理策略的翻译策略并辅以译例证明策略可行性。此外，笔者还从主观方面和客观方面详细分析了选择种族歧视词汇空缺翻译策略的影响因素，并指出译者应综合考虑这些影响因素，选择最恰当的翻译策略，才能保证译文真实再现原著的文化内涵。目前对美国种族关系的研究多停留在民权运动史和种族政策等方面，对种族歧视词汇空缺的研究却很少有人涉及。笔者希望此文能够引起学术界对该领域词汇空缺的研究，为相关领域学者更好地了解美国种族关系史和种族政策提供帮助，同时也为日后译者翻译美国种族问题的学术性文本提供一定的借鉴。﹀
外文摘要：	︿ Although racial discrimination has long existed in America since its independence, today's America has achieved overall integration through the consistent efforts of the whole nation. From other countries and their history, we can learn lessons and develop better ethnic relations in our own country; therefore, the history of race relations and the racial policies of America have been the focus of many Chinese scholars specializing in ethnic relations. However, due to differences between the racial cultures of China and America, lexical gaps on racial discrimination have been one of the obstacles for these scholars in pursuing their studies. Based on the translation project of A City Within a City: The Black Freedom Struggle in Grand Rapids, Michigan, this paper provides some insights on how to deal with lexical gaps on racial discrimination. First, this paper divides these lexical gaps into three types based on whether they have existing translations: lexical gaps with established translations, lexical gaps with controversial translations, and lexical gaps with no common translations. Next, this paper puts forward two principles for translating these lexical gaps, with an emphasis on historical facts and cultural connotations. Then it goes on to propose specific translation strategies with respective illustrations, such as referring to existing translations, borrowing concepts, paraphrasing, adding reasonable annotations, and making flexible adaptations. Last, this paper analyzes factors that influence the choice of translation strategies from the subjective and objective aspects and concludes that only by considering all the factors together can translators choose the most suitable strategies and produce the best translation. So far, research on American race relations has mostly focused on the civil rights movement and racial policies, with few studies on lexical gaps on racial discrimination. This paper provides some new insights into this field, in the hope that it will stimulate further research, provide a reference for future translation of such academic texts, facilitate studies on American race relations and racial policies, and eventually help develop better ethnic relations in China. ﹀
分类号：	H087/TP391
论文总页数：	221
参考文献总数：	51
馆藏号：	017/M2015(458)
公开日期：	2015-05-31

第一人称代词“we”在经济学英语文本中的汉译策略研究—以《经济动态的计算方法》为例.徐翔

题名：	第一人称代词“we”在经济学英语文本中的汉译策略研究—以《经济动态的计算方法》为例
姓名：	徐翔
学号：	1201210902
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	王雷
导师1单位：	外国语学院
论文答辩日期：	2015-05-31
关键词：	第一人称代词we；经济学文本；翻译策略
论文摘要：	︿在经济全球化背景下，经济学领域的研究文献呈井喷式增长，由此也带来了经济学文本翻译的巨大需求。我们在阅读和翻译经济学文献时不难发现，英语中的第一人称代词复数形式“we”作为重要的人称指示语之一，在经济学文本中出现的频率很高，而其它人称代词出现的频率则相对较低,并且，“we”在该类文本不同语境中的意义也不尽相同，在英汉翻译转换时呈现一定特点，若不加以甄别的直译，将影响译文效果。为此，笔者试图通过统计分析方法对其特点和规律进行总结，并运用相关翻译理论来探究经济学文本中“we”的翻译策略。本文结合翻译项目《经济动态的计算方法》，着重探究了以下四个方面的问题：其一，综览中外学者关于第一人称代词与学术性英语的研究成果，兼析了经济学学科属性和经济学英语文本的特点，借鉴前人有关理论研究成果，为研究第一人称代词“we”在经济学英语文本中的用法及基本规律提供理论支撑。其二，探讨了英语第一人称代词复数形式“we”的用法，包括常规用法和非常规用法。其中，非常规用法包括泛指、转指、代指和空指等四种用法。其三，采用语料统计法讨论了“we”的常规用法和非常规用法在经济学英语文本中的分布情况，以及 “we”在经济学英语文本中的隐化现象，尝试分析影响“we”翻译的主要因素。讨论中所引用的语料包括翻译项目《经济动态的计算方法》和北京大学中国语言学研究中心（CCL）双语平行语料库。其四，基于对第一人称代词“we”在经济学英语文本中用法的探讨，归纳并提出“we”的直译法、角色替代法、具体指代法、省略法等四种翻译策略，并引用实例依次进行分析和论证。简而言之，笔者试图探究“we”在经济学英语文本中的使用方法和翻译策略，以期对相关翻译人员有所裨益。﹀
分类号：	H087
论文总页数：	191
参考文献总数：	33
馆藏号：	017/M2015(558)
公开日期：	2015-05-31

管理类书籍中祈使句的翻译研究——以《翻译服务管理》为例.连昭

题名：	管理类书籍中祈使句的翻译研究——以《翻译服务管理》为例
姓名：	连昭
学号：	1201210685
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	王雷
导师1单位：	外国语学院
论文答辩日期：	2015-05-31
关键词：	祈使句；管理类书籍；祈使句翻译
论文摘要：	︿管理类书籍属于社科类文本，旨在传播管理知识和技能，语言简练、用词准确、逻辑清晰。在管理类书籍中，作者除了使用陈述句描写客观事实外，还借助大量的祈使句给予读者建议和提醒。对比口语、公示语及说明文档等文本中的祈使句，管理类书籍中的祈使句具有句式长、逻辑复杂等特点。中英文祈使句在主语、谓语、句型结构等方面均存在差异，如果在翻译时不予考虑与处理，译文会出现过度欧化、不符合汉语表达习惯等问题，有时甚至会伤及读者的面子。在管理类书籍中，祈使句作为作者表达建议、提醒的语言之一，其翻译质量关系到信息传播的效果。根据纽马克文本分类理论，管理类书籍主要由信息型文本和呼唤型文本构成，应采用交际翻译法。首先，笔者在前人研究基础上明确了祈使句的定义，并从语用功能和语法结构维度，将管理类书籍中的祈使句分为四类：1）动词原形引导的肯定祈使句；2）“Don’t”引导的否定祈使句；3）“Let’s”引导的特殊祈使句；4）“if/unless”引导的条件祈使句。其次，通过对比口语、公示语及说明文档中的祈使句，总结出管理类书籍中的祈使句在句型结构、句子成分及语用功能上的特征。再次，结合英汉祈使句在主语、谓语、句式及语气上的异同点，归纳出管理类书籍中祈使句的翻译原则：简明原则和礼貌原则。在此基础上，笔者总结出5种最佳翻译策略，包括化整为零，突出重点信息；去除冗余，避免过度欧化；增补代词，明确作者意图；整合句式，加强逻辑衔接；反说正译，保持原文风格。最后，笔者通过援引《翻译服务管理》一书中的实例对翻译原则和策略进行阐述，同时以问卷调查的形式从读者的视角验证了其有效性，以期为管理类书籍中祈使句的翻译工作提供借鉴。﹀
分类号：	TP391
论文总页数：	210
参考文献总数：	0
馆藏号：	017/M2015(601)
公开日期：	2015-05-31

插入语的翻译策略研究——以《文化与帝国：数字革命》为例.翁敏

题名：	插入语的翻译策略研究——以《文化与帝国：数字革命》为例
姓名：	翁敏
学号：	1201210863
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	王雷
导师1单位：	外国语学院
论文答辩日期：	2015-05-31
关键词：	插入语；对比；翻译策略
外文关键词：	Parenthesis; Comparison; Translation Strategy
论文摘要：	︿在英语中，插入语是常见的语法现象。古代汉语中并没有插入语，而现代汉语受到西方文化和翻译的影响，才逐渐开始使用插入语。尽管如此，由于相同语义条件下的插入语在英汉两种语言中并非完全对应，所以笔者翻译的《文化与帝国：数字革命》一书中大量出现的插入语现象对翻译项目造成了不小的困难，从而引起了笔者对插入语翻译的研究。本文阐述了国内外对插入语翻译的研究情况，指出前人研究工作中所取得的成果和不足之处。提出本文的翻译目的为最大程度地减少中英插入语的差异对目的语读者造成的阅读干扰，尽量使译文贴近中文的表达方式，保证译文的连贯性，减少翻译腔。在此目的下，将插入语分为三种形式：双括号、双破折号和双逗号插入语，分别进行了中英文用法的对比。由于汉英语言的差异，笔者认为在准确地传递原作内容的情况下，译文中插入语符号的使用应当根据其具体用法和目的语标点符号的使用规则做出保留或改变，而不应对原文中的插入语符号进行生搬硬套。因此，在分析了插入语符号在英汉两种语言中的功能和用法之后，本文以对比结果作为指导，分别讨论了这三种形式下的插入语翻译策略：翻译双括号插入语时，可以保留括号形式，根据译文的实际情况改变括号插入语的位置；翻译双破折号插入语时，表示强调作用的插入语保留双破折号，在其他情况下，根据插入语和句子的衔接度，采用融合重组、将双破折号转换为单破折号和括号的策略；翻译双逗号插入语时，保持双逗号形式不变或者将其与其他成分重新组合，融入句子中。在讨论翻译策略时，笔者还列举了同类已出版书籍中的译例与本文译例进行对比分析，最后以问卷调查的形式验证本文提出的翻译策略的有效性。希望这些翻译策略能够在其他译者处理插入语的翻译时有所启示。﹀
外文摘要：	︿ In English, parenthesis is a common grammatical phenomenon. However, ancient Chinese had no parenthesis; it started to appear in modern Chinese under the influence of Western culture and translated books. Even so, parentheses in English and in Chinese are not exactly equivalent even under the same semantic conditions. A large number of parentheses are used in Culture and Empire: Digital Revolution, which caused considerable difficulty in this translation work; therefore, how to better translate parentheses from English into Chinese and summarize effective translation strategies becomes the focus of this paper. This paper reviews research at home and abroad on the translation of parenthesis and points out the achievements and shortcomings of this research. The translation purpose proposed in this paper is to minimize the interference that the differences between English and Chinese parentheses cause for Chinese readers, and to provide coherent translated texts that read like original Chinese with fewer traces of translation. To this end, this paper analyzes the differences in usage between Chinese and English for three kinds of parentheses (a pair of brackets, a pair of em dashes and a pair of commas). On account of the differences between Chinese and English, this paper considers that, on the premise of expressing the source content accurately, the punctuation marks of a parenthesis in the target language should vary according to the specific context and usage rules rather than being applied mechanically. With the results of the comparison of the functions and usages of parenthesis in Chinese and English, this paper goes on to discuss the translation strategies for the three kinds of parentheses. When translating a parenthesis with a pair of round brackets, the translator can keep the round brackets and decide whether to change their position according to the context. When translating a parenthesis with a pair of dashes that serve for emphasis, the translator can keep the dashes; in other cases, the translator can merge the parenthesis into the sentence, or change the dashes into a single dash or round brackets. When translating a parenthesis with a pair of commas, the translator can keep the commas or merge the parenthesis into the sentence. In chapter five, a questionnaire survey is conducted to verify the feasibility of the translation strategies proposed in this paper. It is hoped that these translation strategies can help other translators deal with the translation of parenthesis. ﹀
分类号：	H087/TP391
论文总页数：	165
参考文献总数：	0
馆藏号：	017/M2015(613)
公开日期：	2015-05-31

基于语篇的英汉译文重构分析—以John F.Kennedy的汉译为例.胡蓉

题名：	基于语篇的英汉译文重构分析——以John F.Kennedy的汉译为例
姓名：	胡蓉
学号：	1201210604
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	王雷
导师1单位：	外国语学院
论文答辩日期：	2015-05-31
关键词：	语篇；语篇语言学；语篇特征；译文重构
论文摘要：	︿本文主要探讨语篇特征理论在英汉译文重构中的运用。从传统的翻译模式来看，对译文的分析主要集中在译文与原文字句上的比较，在词句这两个层面上探寻翻译规律，评判是否遵循“信，达，雅”标准，并总结翻译技巧。但随着翻译研究的不断发展，从词句层次来判定译文的成败已经不能解决翻译过程中出现的所有问题，因此也就出现了翻译的语篇语言学研究。翻译的过程就是跨文化交际的过程，原语与译入语的语篇结构和文化背景差异等因素都可能导致交际障碍的产生，所以，在译文分析的过程中注重语篇分析是非常必要的。本文首先对基于语篇进行译文重构的必要性进行了研究，笔者采取了反向思考的方法分别从基于词汇层面和基于句子层面翻译的局限性入手来反证语篇分析的必要性。词汇层面笔者主要考虑的是词意的断章取义；句子层面笔者将其分成了三种情况：忽略隐藏逻辑造成译文的衔接不当；忽略语篇意图造成译文的生硬晦涩；忽略语篇背景，造成读者的一知半解。然后笔者从语篇语言学提出的语篇特征入手，针对每一个方面的局限性提出了相应的译文重构策略：在衔接性和连贯性方面，应该注重分析原文的隐性逻辑关系，适当补充连接词显化内在逻辑，保持译文的衔接性和连贯性；在语境性方面，应该根据源语语篇中的情景语境来解读原文和重构译文，慎重地选择最优译入语词汇；在意图性方面，应该充分揣摩原文的语篇意图灵活翻译，争取在选词或表达方式上准确传递作者的行文意图或者语篇中说话者的意图；在信息性方面，应该从读者的阅读经验和期待视野入手，适当补充原文隐含的背景信息，包括语篇内或者语篇外的信息，减轻读者的阅读负担。笔者首次从词汇和句子层面翻译的局限性入手来反证语篇标准在译文重构过程中的必要性。同时，笔者在语篇特征理论的指导下基于语篇的逻辑、语境、意图、背景这几个关键因素提出了相应的译文重构策略，并辅之以翻译实践中的译例进行正面论证。笔者希望本文的论证方法或者译文重构策略能够在一定程度上启发译者对语篇分析的重视。﹀
分类号：	H315.9
论文总页数：	201
参考文献总数：	0
馆藏号：	017/M2015(429)
公开日期：	2015-05-31

2015-05-29

基于自适应学习模式的英语从句语法教学研究.林毅君

题名：	基于自适应学习模式的英语从句语法教学研究
姓名：	林毅君
学号：	1201210700
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-29
外文题名：	Adaptive Learning Mode Based English Clauses Grammar Teaching Research
关键词：	自适应学习；英语从句；语法教学；推荐系统；学习效率；数据挖掘
外文关键词：	Adaptive Learning; English Clause; Grammar Teaching; Recommendation System; Learning Efficiency; Data Mining
论文摘要：	︿随着计算机技术的发展，教学研究者想通过对教育相关数据使用分析与数据挖掘的方法，寻求改进高等教育机构教学体验和质量的新方式。数据挖掘能被用于实现每个学生个体学习活动的定制化。在国外，基于学生在英语外语教学课程中的表现情况，数据挖掘已多次被用于英语自适应学习和推荐。通过分析英语学习过程中学生的学习情况，自适应学习推荐系统能根据学生个体的具体情况动态地推荐学习材料，而非静态地安排课程内容。自适应学习理论在这种环境下应运而生。当前，国内外有许多教育者正在自适应学习理论的指导下，尝试构建各种自适应学习模式和平台。根据一门课程中学生个体不同的学习能力、掌握水平和学习情况，自适应推荐系统能推荐最适合学生的学习材料和内容，以提高学生的学习效率、改善学习体验和质量。随着人们对英语学习重视程度的提高，自适应学习理论在英语教育和教学研究领域中越来越受欢迎。对于学生英语词汇学习和英语阅读水平的提高等方面的自适应学习，教育研究者进行了诸多尝试。但在英语语法自适应学习上，国内目前的研究还很有限。本文主要研究自适应学习模式在英语从句语法陈述性知识学习中的应用。论文首先对国内外的自适应学习理论以及在此基础上建立的学习推荐系统、学习者能力特点组合模型进行了介绍。根据自适应学习理论，本文选取了数量合理的英语从句语法点作为学习重点，并围绕这些语法点进行知识库的建设，通过对学生建模、设定合理的自适应推荐规则和策略、并依此进行教学实验，以证明本研究所构建的英语自适应语法学习系统（自适应学习策略与规则）的有效性和可行性。本文采用了人工数据分析的方法，模拟计算机系统中由程序实现的学习推荐系统的运行和自适应推荐规则的使用。本文中构建的自适应英语从句语法学习模式能够动态分析出学生的学习效率和语法点掌握情况，依此进行有针对性的学习材料和练习题的推荐。教学实验的结果表明，在相同时间内学习内容完全相同的学习材料，对于英语从句语法陈述性知识，基于自适应学习模式的英语从句语法学习方法比非自适应方法更有效，学生不必重复学习，节省了学习时间，减小了学习压力。本文是对英语自适应语法学习系统构建的一次探索，希望通过本教学研究，能为提高英语语法学习效率、改善英语语法教学提供一种新方法和新思路。﹀
外文摘要：	︿ With the development of computer technology, teaching researchers seek new ways to improve the teaching experience and quality of higher education institutions by applying analysis and data mining to educational data. Data mining can be used to customize every student’s learning activities. Abroad, data mining has been used many times for adaptive English learning and recommendation, based on students’ performance in English as a Foreign Language courses. By analyzing the details of students’ learning in the English learning process, an adaptive learning recommendation system can adapt to students’ individual circumstances and recommend learning materials dynamically rather than arrange course content statically. Adaptive learning theories emerged in this environment. Currently, under the guidance of adaptive learning theories, many educators at home and abroad are trying to construct various adaptive learning patterns and platforms. According to individual students’ different learning abilities, levels of mastery and learning conditions in a course, an adaptive recommendation system can recommend the learning materials and content most suitable for each student, so as to enhance students’ learning efficiency and improve learning experience and quality. As people pay increasing attention to English learning, adaptive learning theories are becoming more and more popular in the research field of English education and teaching. Education researchers have made many attempts at adaptive learning for students’ English vocabulary learning and for improving English reading comprehension. However, current research on adaptive learning of English grammar is still limited in China. This paper mainly studies the application of the adaptive learning mode to the learning of declarative knowledge of English clause grammar. This paper first introduces adaptive learning theories at home and abroad, the learning recommendation systems built on them, and the combined model of learner ability and characteristics. According to adaptive learning theories, this paper chooses a reasonable number of English clause grammar points as learning key points and builds a knowledge base around these grammar points. By building a student model, setting reasonable adaptive recommendation rules and strategies, and conducting teaching experiments, this paper aims to prove the effectiveness and feasibility of the English adaptive grammar learning system (adaptive learning strategies and rules) constructed in this research. This paper adopts manual data analysis to simulate the running of the learning recommendation system and the application of the adaptive recommendation rules that would be realized by programs in a computer system. The adaptive English clause grammar learning mode built in this paper can dynamically analyze students’ learning efficiency and mastery of grammar points, and recommend targeted learning materials and exercises accordingly. The results of the teaching experiment show that, within the same time period and with learning materials of identical content, the English clause grammar learning method based on the adaptive learning mode is more effective than the non-adaptive method for declarative knowledge of English clause grammar. With the adaptive learning method, students need not repeatedly learn what they have already learned, which saves learning time and reduces learning pressure. This paper is an exploration of the construction of an English adaptive grammar learning system, and aims to provide a new method and new way of thinking for enhancing English grammar learning efficiency and improving English grammar teaching. ﹀
分类号：	H087/TP391
论文总页数：	110
参考文献总数：	0
馆藏号：	017/M2015(495)
公开日期：	2015-05-29

基于自适应模式的英语阅读教学研究.吕京

题名：	基于自适应模式的英语阅读教学研究
姓名：	吕京
学号：	1201210734
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-29
关键词：	自适应；英语阅读教学；阅读能力；文本分析
论文摘要：	︿信息技术的不断发展引发了人们对传统的自主学习模式的反思。在传统的自主学习模式中，学习者处于被动学习的地位，面对浩瀚如海的学习材料，学习者很难找到适合自己的学习内容，太难或者太简单的学习材料都不利于快速提高学习者能力。传统的自主学习模式忽略了学生的个性化差异，是一种粗放的“one-fits-all”的模式。因此，以学生个性化特征为基础的自适应的学习方法应运而生，而信息技术的发展为基于自适应的学习模式创造了条件，自适应学习系统开始发展。自适应学习系统首先分析学生的个性特征和行为，建立学生档案，包括学习模式、学习风格、学习习惯、学习轨迹，并对学生的初始能力进行分析，以此为依据对学习过程中的内容或者学习序列进行干预，推荐个性化的学习材料和学习路径。本研究将自适应的核心理念应用到英语阅读教学中来，以英语阅读论文为指导，提出了基于自适应模式的英语阅读教学方法。本文首先分析影响学生英语阅读理解的多个因素，通过对学生的个性化分析和学习材料的深度挖掘，为不同特征和不同能力水平的学生推荐适合的学习内容、学习策略和学习序列。本文根据自适应学习系统的核心模块，以大学英语六级的仔细阅读为切入点，探讨了学习材料、学习者和英语阅读的自适应规则三个维度。学习材料从阅读文本材料的文本难度、六级词汇覆盖率、题目类型和阅读策略来衡量；学习者包括学生的词汇量、初始英语阅读水平、英语阅读焦虑度等因素；自适应规则主要体现在文本难度的自适应、根据学生的阅读成绩动态变化的文本难度序列的自适应、阅读前导的自适应和题目类型和阅读策略的自适应。笔者通过两次教学实验，论证基于自适应模式的英语阅读教学的可行性。第一次实验论证因材施教的教学模式的有效性，实验历时六周，以唐山学院 2013 级对外汉语的 40 名学生为研究对象，分为实验组和对照组，证明了以学生的个性化特征为基础的因材施教的英语阅读教学模式的有效性。第二次实验论证了基于自适应模式的英语阅读教学方法的有效性，实验历时六周，以河北金融学院 2013 级商务英语专业的 40 名学生为研究对象，随机分配实验组和对照组，通过问卷调查、前测、学习过程自适应调整、后测和访谈的形式收集数据，从定量和定性的角度分析自适应模式的英语阅读教学方法的有效性。经过 SPSS 统计软件的数据分析，本论文得出以下四个实验结论：1. 以学生个性特征为基础的因材施教的教学方法的有效性。2. 自适应模式的英语阅读教学方法对提高整体英语阅读水平有效。3. 自适应模式的英语阅读教学方法对提高学生的阅读细节理解能力很有成效。4. 自适应模式的英语阅读教学方法可以纾解学生的阅读压力，学生在情感上更倾向于自适应的英语阅读教学模式。﹀
分类号：	H087/TP391
论文总页数：	98
参考文献总数：	0
馆藏号：	017/M2015(521)
公开日期：	2015-05-29

基于翻译认知心理的新型机器翻译系统交互界面的研究.林毅超

题名：	基于翻译认知心理的新型机器翻译系统交互界面的研究
姓名：	林毅超
学号：	1201210699
专业：	软件工程（二级学科名称）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-29
关键词：	翻译认知机器翻译机器翻译原文标注原文标注眼动追踪
论文摘要：	︿机器翻译和计算机辅助翻译工具的出现，大大提升了翻译的效率和质量，深刻影响着当前的翻译行业，而综合二者优势应运而生的交互式机器翻译，作为一种新型的机器翻译系统，在交互性能、翻译效率和翻译质量上都将是翻译技术界的又一次革新。本文以北京大学语言信息工程系当前正在开发的新型机器翻译系统为技术依托，通过对新系统交互界面元素的系列研究，力求提出一套最佳组合方案，为使用系统的专业译员提供更智能的基于翻译认知的界面交互提示，从而大大提升专业翻译绩效。本研究主要围绕“英译汉过程的阅读绩效”和“汉译英输入预测的绩效”两大主线展开，即在原文阅读和译文产出两方面进行深入研究。在阅读方面提出“原文标注”的设想，即通过机器预处理标注出原文中的难译词和诸如指代关系等特殊语法现象，从而对译者产生提示作用，提升翻译的精确度；在产出方面提出“输入预测”的设想，即利用机器翻译结果和相关语料库检索结果，为译者呈现出交互性和实时性更强的动态预测提示，从而节省译者的按键率并加速译者的思维过程，提升翻译的整体绩效。围绕这两大主线，本研究进行了四个基于翻译认知的眼动追踪实验，在英译汉的“原文标注”方面，探究难译词的标注对翻译效率的影响以及标注指代关系段落关键词组的眼动研究；在汉译英的“输入预测”方面，探究输入预测的合适提示词数以及候选词列表的颜色显示方案。通过分析四大实验所有被试的用时、速度、准确率等非眼动数据，以及注视点数、注视时长、视线转移次数、转移停留时长等眼动数据，最终得出“高亮”原文难译词可以提高汉译英的翻译绩效，指代翻译中关键词组的标注与否对指代关系判断绩效无直接影响，与句长和难度成正比的3-6词之间的“完整意群”预测长度可以提高英译汉输入预测提示绩效，以及“蓝渐弱”的候选词列表颜色渐变方案可以提高英译汉过程中候选词提示的绩效等重要结论，这些结论将进一步为新型机器翻译系统交互式界面的实现提供依据。﹀
分类号：	H087/TP391
论文总页数：	113
参考文献总数：	0
馆藏号：	017/M2015(537)
公开日期：	2015-05-29

基于语料库的中美时政新闻英语语体特征对比研究.范琳琳

题名：	基于语料库的中美时政新闻英语语体特征对比研究
姓名：	范琳琳
学号：	1201210558
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-29
外文题名：	Corpus-based study on the comparison of Chinese and American political news register
关键词：	时政新闻；多维度分析；语体；语言特征
外文关键词：	political news; Multi-dimension approach; register; language feature
论文摘要：	︿本研究从中国的四个主流新闻网站和美国的七个主流新闻网站上收集了16万余字，200余篇的时政新闻，建立了中美时政新闻语料库，采用Biber提出的多维度分析法，并在此基础上新增语言特征，对中美时政新闻的语体特征进行对比研究，得出两者在六个维度上的差异，并在新闻传播角度下为中国时政新闻的创作提出建议；另外，作为对多维度分析法的补充，本研究采用定量分析与定性分析相结合的方法分析了中美时政新闻在“简洁性”，“客观性”和“生动性”方面的差异，并提出创作建议。在多维度分析中，本研究突破了前人一贯沿用或选用Biber的67个语言特征，将之相似度较高的特征进行合并，得到58个特征，在此基础上又新增了4个更加贴近于时政新闻的语言特征，最终共利用62个语言特征对中美时政新闻进行因子分析，得出两者在六个维度上具有显著差异性。其中，美国时政新闻表现出较好的互动性，叙述性，逻辑性，情感性和大众性；而中国时政新闻则表现出较强的正式性，信息性和官方性。此外，本研究利用统计语言特征的方法讨论了中美时政新闻在“简洁性”、“客观性”和“生动性”三个方面的不同：在“简洁性”方面，美国时政新闻的可读性较高，语言相对比较简洁，而中国时政新闻语言相对复杂，可读性低；在“客观性”方面，美国时政新闻的语言更加具有真实性，中国时政新闻的语言具有一定的政治倾向性；在“生动性”方面，美国时政新闻善用新造词和修辞手法，语言更加生动形象，而中国时政新闻生动性表现不足。最后，本研究根据多维度分析和语言特征统计的结果，在对外宣传的视角下提出了有利于提高中国时政新闻创作的策略：提高文本的互动性，减弱信息化、正式化和官方化程度，增强文章的情感性；以及适当选用简单句，多用直接引语，注意情态动词的使用和重视修辞的使用等。﹀
外文摘要：	︿ This study collected more than 160,000 words (over 200 texts) of political news from four major Chinese news websites and seven major American news websites to build a Chinese-American political news corpus. Adding new language features to Biber’s multi-feature/multi-dimension (MD/MF) approach, it compares the registers of Chinese and American political news on six dimensions and offers suggestions for improving the writing of China’s political news. To complement the MD/MF approach, quantitative and qualitative analyses were also used to compare the two registers in terms of “conciseness”, “objectiveness” and “vividness”, and the results were used to give advice on the translation and creation of China’s political news. In the MD/MF analysis, this study added four new language features closely related to the political news register to a consolidated set of 58 features and ran a factor analysis on all of them, departing from earlier studies that simply adopted or selected from Biber’s original features. The results show remarkable differences between the Chinese and American political news registers on six dimensions. American political news shows stronger interactivity, narrativity, logicality, emotionality and popularity, whereas China’s political news shows stronger formality, informativity and an official tone. In addition, this study discusses the differences between the two registers in terms of “conciseness”, “objectiveness” and “vividness”. In terms of conciseness, American political news scores higher on readability and its language is relatively concise, while the language of China’s political news is relatively complicated and less readable. In terms of objectiveness, the language of American political news is more truthful, while China’s political news shows a certain political tendency. In terms of vividness, American political news makes good use of newly coined words and figures of speech, so it is more vivid than China’s political news. Finally, based on the results of the MD/MF analysis and the feature statistics, this study proposes strategies for improving the creation and translation of Chinese political news from the perspective of international publicity: improve the interactivity and emotionality of the texts; reduce their formality, informativity and official tone; use simple sentences and direct speech where appropriate; use modal verbs with caution; and make more use of rhetorical devices. ﹀
分类号：	H087/TP391
论文总页数：	83
参考文献总数：	0
馆藏号：	017/M2015(544)
公开日期：	2015-05-29

可及性视角下英语指示照应的翻译策略——以Institutionalization of UX的汉译为例.符吉聪


题名：	可及性视角下英语指示照应的翻译策略——以Institutionalization of UX的汉译为例
姓名：	符吉聪
学号：	1201210568
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-29
关键词：	可及性；指示照应；指示照应词；翻译策略
论文摘要：	︿ Institutionalization of UX一书旨在通过提供进阶式的指导来协助各公司或组织将用户体验（UX: User Experience）变成其内部的一项体制，直到最终实现“用户体验的体制化”。该书作为企业“用户体验体制化”的经典范本，受众面主要为广大互联网从业人员（尤其是大量的管理人员和互联网创业者）和传统行业中想要拥抱“互联网思维”的高层管理者。书中大量出现的指示照应现象给笔者的翻译项目造成了不小的困难，从而引起了笔者对指示照应的翻译研究。作为照应形式当中的一种，指示照应也是语篇衔接与连贯的重要手段，指示照应翻译的好坏也影响着译文语篇的衔接与连贯。通过本次翻译项目的总结以及对类似书籍译文中指示照应翻译的研究，笔者发现指示照应的翻译中主要遇到了“指示不够准确”和“译文不够简洁”这两个问题。结合可及性理论，笔者从原文、英汉差异和译者主观因素三个方面对影响指示准确性的原因和因素进行了分析，并从英汉衔接照应对比的层面对影响译文简洁性的原因也进行了分析。这些分析不仅仅关注于其他译者所关注的指示照应词本身，还将视角延伸到了其所连接的名词及其指示对象。最后，根据这些分析，笔者以“提高指示准确性”和“增强译文简洁性”为目的提出了“增强指示词语的信息度”、“指示对象替换指示照应词”、“直译加注释”、“根据指示对象还原/增译名词”、“增加指示对象和指示词语之间的词汇衔接”、“省略指示词语，合并主语”和“省略与指示对象邻近的指示词语”这些翻译策略，较好地解决了笔者在指示照应的翻译中遇到的问题，并希望这些翻译策略能够对其他译者处理指示照应的翻译有所启示。﹀
分类号：	TP391
论文总页数：	212
参考文献总数：	0
馆藏号：	017/M2015(577)
公开日期：	2015-05-29

深度学习在依存分析中的应用.黄苹苹


题名：	深度学习在依存分析中的应用
姓名：	黄苹苹
学号：	1201210611
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-29
外文题名：	Dependency Parsing With Deep Neural Networks
关键词：	依存分析；深度学习；递归神经网络；重排序
外文关键词：	Dependency parsing; Deep learning; Recursive neural network
论文摘要：	︿依存分析是建立在依存文法的理论基础上，通过分析句子中词与词之间的依存关系来表示句法结构的一项语法解析任务。依存文法假设句子中的核心动词是支配句子的中心成分，其他词则直接或间接地修饰该核心词。从形式上来看，依存解析将句子表示成一个弱连通图，节点表示句子中的词，有向弧表示词之间的中心-修饰关系。由于依存分析任务定义明晰，形式简洁，并且在很多其他任务中得以有效应用，因此该任务一直受到计算语言学界的研究关注。目前基于统计学习的有指导依存分析模型可以分为两类：一类是基于转移序列的模型，该类模型从初始状态开始，通过不断选择移入-归约动作，达到终止状态，而整个转移序列则构造出一棵完整的依存树；另一类基于图搜索的模型直接对词之间的依存关系进行建模，该类模型将依存树的生成等价于在有向图中搜索最大生成树的问题。这两类模型虽然都取得了非常好的解析效果，但是模型都存在各自的缺陷：基于转移序列的模型并不直接对依存树建模，而是借助于转移序列，并且在解码阶段因采用贪心策略而面临“错误传播”问题；基于图搜索的模型虽然采用精确的搜索算法，但是解码时需要假设依存树中边与边之间相互独立，因而模型只能选择基于边的简单特征。除此之外，这两种模型在特征表示方面都面临以下问题：一，模型只能利用基于词的、浅层的特征，语义特征无法得到有效表示，而语义对语法排歧具有很好的指示作用；二，模型特征都仅限于局部范围内，而长距离依赖是自然语言中的普遍现象。深度学习是近些年复兴的基于深层神经网络的一类机器学习模型的总称，由于其在特征表示、自动特征选择、语义学习与表示方面的强势表现，令很多传统模型难以企及。深度学习的模型架构和训练方法也日益发展完善。本文正是针对传统依存解析模型的不足之处，结合深度学习的优势，将深层递归神经网络扩展应用到依存分析任务中来。一方面，该深度模型利用了神经网络在语义学习和表示方面的特长，将语义特征融入到模型决策中；另一方面，递归神经网络能有效地融合整个子树的信息，因此特征更加全局。我们对递归网络结构进行一定的修改以适应依存树结构的特殊性；模型学习时，采用基于最大化间隔的损失函数，直接对依存树结构进行建模与学习。我们首先将该递归神经网络模型用于直接解码，具体地，我们采用剪枝策略和基于Eisner的束搜索算法进行解码。同时，我们也将该模型用在重排序阶段，对基础模型给出的k个备选解析树进行重新评价。我们以较有竞争力的Arc-standard系统为对比的基础模型，在PTB3.0英文数据集上，递归神经网络直接解码时在开发集和测试集分别取得 91.56 和 91.41 的UAS，比基础模型分别高出 0.77 和 0.78 的UAS值；在重排序阶段，我们的模型分别取得了 92.84 和 92.46 的UAS，相比基础模型分别提高了 2.05 和 1.83 的UAS值，该解析效果也超过了二阶图搜索模型，证明了模型的有效性。同时通过对实验结果的分析可得，本文提出的模型在长距离依赖关系的判别上具有更大优势。本文工作的主要贡献在于：一，将递归神经网络模型扩展到依存分析任务中，针对传统模型中难以表示语义信息、特征浅层局部的缺陷，利用递归神经网络对语义和整个子树的信息进行表示，既提高了模型的准确率，又更好地判别出长距离依存关系。二，本文对传统递归神经网络结构进行修改，提出基于Pooling层的递归神经网络，为出度不一致的树结构的表示提供了一种解决方案。我们在解码时使用基于Eisner的束搜索算法，因而尽管模型融合了高阶的特征，却依然保持一阶图搜索模型的复杂度。﹀
外文摘要：	︿ Dependency parsing is grounded in dependency grammar and aims to parse the syntactic structure of a sentence in terms of dependency relations. It assumes that a finite verb is the structural center of the sentence and that all other units in the sentence are connected to the verb either directly or indirectly. These connections between words are called dependencies; they describe the head-modifier relations between words, and the whole sentence can be represented by them as a weakly connected graph. Dependency parsing is simple and clear in form and is well suited to many other NLP tasks, such as machine translation and knowledge extraction. It is therefore attracting increasing attention. Current methods for dependency parsing mainly fall into two categories: transition-based models and graph-based models. The former start from an initial configuration and greedily choose the next shift-reduce action until reaching the terminal configuration, and then output the dependency tree structure. The latter take a different view and model the dependency relations directly: they decompose the dependency tree into small components and reduce parsing to the maximum spanning tree problem in graph theory. Although both methods achieve state-of-the-art performance, they face two obvious limitations in feature representation. First, these models cannot represent semantics well, yet semantic information is strongly indicative in syntactic parsing. Second, these models can only use very local features, which often fail to capture the long-distance dependencies common in natural language. Deep learning has re-emerged in recent years and shed light on semantic representation and automatic feature learning and selection; it outperforms traditional models by a large margin in many tasks such as language modeling, machine translation and Chinese word segmentation. Inspired by its success in semantic representation, we apply the recursive neural network to dependency parsing. Our main motivations are as follows. First, deep neural networks are better at representing semantic information. Second, a recursive network can compress and represent all the information of a subtree in a fixed-length vector, which enables us to use more global, subtree-based features in place of the traditional local, shallow features based only on nearby words. Each node in the dependency tree is expressed by two embeddings: a content embedding for the node itself, and a context embedding that compresses all the descendant nodes of that word, in other words, the whole subtree information of that node. We adapt the standard recursive neural network structure for dependency trees and train the model with a max-margin loss defined on the whole tree structure. We first use this recursive neural network in the decoding phase, which searches for the highest-scoring tree structure using an Eisner-based beam search decoding algorithm. We achieve 91.56 and 91.41 UAS on the development and test sets, improvements of 0.77 and 0.78 over a competitive beam-search Arc-standard system. Because the model scores the whole tree structure, we can also use it as a reranking model. In that setting we achieve 92.84 and 92.46 UAS on the development and test sets, improvements of 2.05 and 1.83 over the baseline system. All experiments are based on the PTB3.0 data set. Our contributions can be summarized as follows. First and most important, we use a deep recursive neural network to represent more global features and better capture semantics, which makes up for the two main limitations in feature representation in traditional models. Second, we extend the recursive neural network to dependency parsing with some adaptations to the network structure. Although we achieve the goal of higher-order graph-based models in terms of feature representation, the model runs in the same complexity as first-order models. ﹀
分类号：	H087/TP391
论文总页数：	46
参考文献总数：	44
馆藏号：	017/M2015(580)
公开日期：	2015-05-29

跨文化传播下中国用户对技术文档需求的实证研究——以归纳/演绎结构以及图文关系为例.李倩


题名：	跨文化传播下中国用户对技术文档需求的实证研究——以归纳/演绎结构以及图文关系为例
姓名：	李倩
学号：	1101210751
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-29
关键词：	跨文化；技术文档；中国用户；归纳和演绎；图文关系
论文摘要：	︿全球化促进了国内外企业的交流，给设计产品文档的技术写作人员提出了新的挑战。技术写作诞生且成熟于西方，中国的技术写作也由跨国公司引入中国市场。目前中国的技术写作领域大量充斥着西方的写作风格、工具、标准等。对于写作人员来说，是毫无保留地使用西方已有成果还是有所选择？这取决于中国用户对于技术文档的需求，也就是什么样的技术文档最能让中国用户更好地理解技术知识、使用技术产品并获得最佳的使用体验。本研究首先分析Hall、Trompenaars 以及 Hofstede的文化维度，总结出最能代表中西文化差异的维度：即高/低语境、普遍/特殊主义、专一/扩散、内部/外部控制、高/低权力距离、个人/集体主义以及长期/短期导向。由这些维度出发，分析中西技术文档在结构、风格以及视觉设计方面的不同写作策略。然后通过两个实验分别从结构和图文关系探讨哪些技术文档的设计策略最适合中国用户理解技术文档的描述性信息以及使用步骤性信息完成任务。实验一研究归纳和演绎两种不同的篇章结构对于用户阅读理解和对文章偏好的影响。50名被试被随机分为两组，分别阅读内容相同但是结构不同的技术文档，阅读后完成完形填空、找出文章中心句并对文章进行评价。结果表明被试阅读归纳结构文章时速度更快，并且完形填空正确率更高。然而并未发现两组被试在中心句识别正确率以及对文章评价方面有差异。因此，若向中国用户介绍某种新的技术概念，以归纳的结构组织文档更能促进用户的理解。实验二研究五种不同的图文关系：纯图片、纯文本、冗余图文关系、补充图文关系、互补图文关系对于用户初次学习软件操作效率的影响、用户迁移学习的影响、用户对于软件和手册评价的影响。55名被试被随机分为五组，分别使用不同图文关系的软件操作手册完成软件操作任务以及对软件以及使用手册进行评价。结果发现图文结合的技术文档更能帮助用户完成任务且用户满意度更高。而在图文结合的手册中，补充图文关系手册表现最佳，互补关系手册次之，冗余关系手册最差。纯图片手册与纯文本手册的优劣势不同，纯图片手册的优点是帮助用户快速完成任务，而缺点是完成任务正确率低。纯文本手册优点是帮助用户正确完成任务，而缺点是需要的时间较长。因此，希望用户更快、更满意地完成软件操作任务，图文结合的手册比纯文本或者纯图片的操作手册更好，而图文结合的手册中，补充图文关系的手册表现最好。本研究通过对比中西现有的技术文档写作策略以及两组实验测试中国用户的行为和偏好，证明无论是设计技术文档还是将技术文档本地化都需要考虑目标用户的使用习惯，而不是照搬照抄西方的写作策略。﹀
分类号：	H087/TP391
论文总页数：	131
参考文献总数：	0
馆藏号：	017/M2015(590)
公开日期：	2015-05-29

软件行业英语应用移动学习资源库构建研究——以Project X为例.袁凯


题名：	软件行业英语应用移动学习资源库构建研究——以Project X为例
姓名：	袁凯
学号：	1201210962
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	张宏岩
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-29
外文题名：	Construction of Mobile Application Resource on English Learning for Software Industry: A Case Study of Project X
关键词：	软件行业英语；移动学习；资源构建
外文关键词：	English for Software Professionals; Mobile Learning; Resource Construction
论文摘要：	︿伴随着软件行业蓬勃发展，对人才的要求也越来越高，急需有国际竞争力的高水平人才。软件行业英语的学习需求量也因此日益增加。随着移动信息技术的高速发展，ESP教学传统的资源支持方案受到移动信息化的冲击，急需资源提供方式的改变。ESP移动学习资源库的建设，不仅在提高ESP教学质量、挖掘ESP教学的发展潜力方面发挥着重要作用，而且也可以最大限度的发挥学习者的主动性和积极性。但是，目前国内的ESP移动学习资源库建设的情况却并不乐观，资源的建设并不能满足学习者实用需求和教学者的使用需求，资源没有按照教学活动的需要进行提供，资源共享率不高。如何将ESP教学的最佳实践应用于移动学习的环境，建立规范、实用、可共享的移动学习资源库，已经成为有待解决的问题。本文从资源库的移动教学实践、资源分类与建设、资源应用三个层面进行了研究，尝试提出一种行业英语移动学习资源库的构建方法。本文首先通过分析现有ESP教学理论和ESP教学典型实践，得到了当前ESP学习资源和教学活动的对应情况。通过与移动学习的特点结合，给出了适用于移动学习环境的教学活动以及对应资源。然后以软件行业为例，通过对软件行业英语学习者的学习需求、目前市场上的行业英语学习应用以及行业英语课程展的调研分析，为资源的分类以及构建提供了指导。据此，本文从资源类型、资源来源、资源建设以及资源整合与管理四个方面对软件行业英语学习资源库的构建进行具体研究。最后通过资源库的资源支持构建出具体教学实践并进行了教学效果的初步验证。结果表明本文提出的资源构建方式能初步满足用户对软件行业英语的需求。在当前跨学科移动学习资源短缺的情况下，Project X资源库可以对软件行业移动英语学习提供一定的帮助。﹀
外文摘要：	︿ As the software industry flourishes, more and more talented people are needed in the international market, and the need for learning English for the software industry is rising accordingly. The resources that support traditional ESP teaching are being challenged by mobile technology and need to change the way they are provided. Building an ESP mobile learning repository not only plays an important role in improving the quality of ESP teaching and tapping its development potential, but also maximizes the initiative and enthusiasm of learners. However, the current state of ESP mobile learning resources in China is not encouraging. Resources cannot meet the practical needs of learners and instructors, they are not designed in accordance with the needs of teaching activities, and they are difficult to share. How to apply the best practices of ESP teaching to the mobile learning environment and establish standard, practical and shareable resources has become a problem to be solved. From the perspectives of mobile teaching practice, resource classification and construction, and resource application, we try to put forward a method for building ESP mobile learning resources. First, by analyzing existing ESP teaching theory and typical teaching practice, we summarize ESP teaching activities and their corresponding resources. Combining these with the characteristics of mobile learning, we give the teaching activities suited to mobile ESP learning and their corresponding resources. Then, taking the software industry as an example, we survey the needs of learners of English for the software industry, the industry English learning applications currently on the market and industry English courses, which provides guidance for classifying and building resources. Accordingly, we study the construction of the repository in four respects: resource types, resource sources, resource development, and resource integration and management. Finally, we build a concrete teaching case supported by the repository and carry out a preliminary validation of the teaching effect. The results show that the proposed way of building resources can initially meet users' needs for English for the software industry. Given the current shortage of interdisciplinary mobile learning resources, the Project X repository can provide some help for mobile English learning in the software industry. ﹀
分类号：	H087/TP391
论文总页数：	71
参考文献总数：	42
馆藏号：	017/M2015(725)
公开日期：	2015-05-29

机器翻译译后编辑对英汉翻译效率提升研究.张路露


题名：	机器翻译译后编辑对英汉翻译效率提升研究
姓名：	张路露
学号：	1201210986
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-29
关键词：	译后编辑；机器翻译；翻译效率；翻译质量；译后编辑策略
外文关键词：	Post-editing; Machine translation; Translation productivity; Translation quality; Post-editing strategy
论文摘要：	︿随着翻译市场的逐渐扩大和机器翻译技术的逐渐成熟，通过机器翻译译后编辑的方式生产译文成为除翻译记忆技术以外的另一种提升翻译效率的机器辅助翻译方式，以应对大量的翻译需求，帮助企业和机构降低本地化成本，缩短产品上市时间并提高客户满意度。目前已有部分关于译后编辑提升效率的研究。但是由于汉语语言不同于印欧语系语言，初级译员在英译汉的情况下使用译后编辑能否达到同印欧语系翻译相当的翻译效率提升程度，值得进一步研究。译后编辑方式可以提升的翻译效率对于本地化公司和初级译员的翻译实践工作有实际的意义。本论文以机器翻译加译后编辑的翻译方式为主要研究对象，以Google Translate 作为机器翻译引擎为例，对比两种不同类型的文本，设计实验研究英译汉译后编辑对翻译效率的提升情况。本文立足于非专业化的翻译环境背景，选用免费的Google Translate作为机器翻译引擎，以译文能完整准确传达原文含义并尽可能地快速完成译后编辑过程为目标，分别选择新闻、操作手册两种不同类型的文本，设计研究实验从翻译耗时和击键行为两个维度讨论译后编辑方式对翻译效率的提升，并分析句长、句型和句子的易读性与译后编辑效率提升的相关性。通过实验研究，笔者得出结论：基于本文的实验材料，初级译员使用译后编辑方式在节约文本的整体翻译时间上效果明显，技术文档效率提升36.53%-44.01%，新闻文本效率提升20.77%-46.43%。采用译后编辑方式的翻译过程被试的停顿时间相应有所减少（5.17%-23.34%），但在多数情况下会增加停顿时长占总翻译时间的比例。译后编辑方式对译文开始输出前耗时的影响较明显，但可能受文章本身特征及其他因素的影响，结果具有不稳定性。在译员击键行为方面，译后编辑方式可以大幅度降低译文相关字符的录入数（53.69%-75.97%）。研究结论仍需大样本进行进一步验证。﹀
外文摘要：	︿ With the expansion of the translation market and the maturing of machine translation technology, post-editing of machine translation has become another productive way of producing translations, alongside translation memory. Machine translation with post-editing can meet growing translation needs, help reduce localization costs, shorten time to market and improve customer satisfaction. Because Chinese differs from the Indo-European languages, it remains to be studied how much productivity novice translators gain from post-editing English-Chinese machine translation output, and whether the gain is comparable to that for Indo-European language pairs. The productivity gain is of practical significance for localization companies. In this paper, an experiment is designed to compare the productivity of post-editing English-to-Chinese machine translation output with that of traditional translation across two text genres, taking the relevant variables into account. The experiment is set in a non-professional translation environment, with Google Translate as a general-purpose machine translation engine, rapid post-editing as the goal, and extracts from news and user manuals as the source texts. The productivity gain from post-editing is discussed by analyzing editing time and keystrokes, and the correlation between text characteristics and productivity gain is also discussed. The results show that post-editing brings a productivity gain of 36.53% to 44.01% for technical documents and 20.77% to 46.43% for news texts. Pause time during translation also decreases by 5.17% to 23.34%, but the share of pause time in total translation time increases. Post-editing also affects the time spent before the translator starts editing the target text. Letter and symbol keystrokes decrease by 53.69% to 75.97% with post-editing. These conclusions hold only for the experimental materials used here; larger samples are needed to verify the productivity gain. ﹀
分类号：	H087/TP391
论文总页数：	64
参考文献总数：	0
馆藏号：	017/M2015(767)
公开日期：	2015-05-29

中国饮食全球化进程中的菜名英译研究.李秀颖


题名：	中国饮食全球化进程中的菜名英译研究
姓名：	李秀颖
学号：	1301210744
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-29
外文题名：	A Study on English Translation of Chinese Dish Names in the Globalization of Chinese Food
关键词：	中国饮食；全球化；中餐菜名；英译研究
外文关键词：	Chinese Food; Globalization; Chinese Dish Names; English Translation Study
论文摘要：	︿全球化背景下，中国饮食已走出国门，走向世界，行进在全球化浪潮的前列。在中国饮食的全球化进程中，中餐菜名英文译法作为外国友人了解中国饮食的途径，无疑发挥着至关重要的作用，因此也倍受国内译界学者的关注和重视。近二十年来，有关中餐菜名英文译法的研究已积累了颇为丰硕的成果，对国内餐饮行业中餐菜名英文译法的规范和统一起到了一定的推动作用。为进一步促进中国饮食的全球化进程，提升国内中餐行业对外服务水平，本篇论文试图对中餐菜名英译展开专项研究，首先通过对国内外中餐菜名英译实例考察和分析，总结出影响中餐菜名英译效果的三大因素，其次根据影响因素提出中餐菜名英译的原则及策略，最后通过问卷调查的方式对论文结论的有效性及有效性程度进行验证。本篇论文在理论分析和实证分析的基础之上得出中餐菜名英译的原则及策略，并通过问卷调查法得出各项英译原则在实际应用中的参考价值及重要性的高低，从高到低依次分别为传递菜肴信息、提升用户体验以及弘扬饮食文化，相应的翻译策略分别为增添信息法、备注提示法以及文化加注法。据此，译者在中餐菜名英译的实际过程中可对中餐菜名英译原则及策略进行倾向性选择，尽量实现中餐菜名英译的最佳效果，即在最大程度上符合目标顾客的心理预期，提升目标顾客的就餐体验并促进中国饮食文化的传播。本篇论文的研究结果有助于国内译界学者更加有效地开展日后中餐菜名的英译工作以及中国饮食文化的外宣工作。﹀
外文摘要：	︿ In the era of globalization, Chinese food has spread across the world. The English translation of Chinese dish names is an important way for foreigners to learn about Chinese food and the culture behind it, and it has received great attention from scholars in China. Over nearly two decades, studies on the English translation of Chinese dish names have produced fruitful results, which have promoted normalization and standardization in the domestic food industry. This paper studies the English translation of Chinese dish names in three steps. First, it summarizes three factors that affect the quality of translation by analyzing examples of English translations of Chinese dish names. Second, it proposes principles and strategies for the English translation of Chinese dish names. Third, it verifies the validity of the conclusions through a questionnaire survey. The paper puts forward three principles for the English translation of Chinese dish names: conveying dish details, improving customer experience, and promoting Chinese culture. The corresponding translation strategies are information addition, remarks & features, and cultural connotation. Based on this analysis, translators can choose appropriate strategies during translation to achieve the best effect, so as to meet the expectations of Western customers, improve their dining experience and promote the spread of Chinese food culture. The paper is conducive to future work on the English translation of Chinese dish names and on the international promotion of Chinese culture. ﹀
分类号：	H087/TP391
论文总页数：	75
参考文献总数：	34
馆藏号：	017/M2015(785)
公开日期：	2015-05-29

科技博客的语言特点和编译策略研究.孙瑜


题名：	科技博客的语言特点和编译策略研究
姓名：	孙瑜
学号：	1201210794
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-29
关键词：	科技博客；新媒体；传播；语言；编译策略
外文关键词：	Tech blogging; New media; Mass media; Language; Trans-editing strategies
论文摘要：	︿科技博客最早以编译国外的博客文章起家，编译策略对科技博客的传播效果起到至关重要的作用。本论文以科技博客为主要研究对象，从文本和超文本两个维度研究科技博客编译的影响因素，并试图探究科技博客和新闻在语言特点和编译策略上的差异。在语言特点方面，利用语料库为分析工具，对科技博客做了文本和超文本研究。在文本分析方面，分别从词汇、句子、语篇三个层面进行了统计学分析。在词汇层面，统计和分析了人称代词、科技术语等科技博客较为明显的词汇特征。在句子层面，统计和分析了平均句长、句子时态和语态。在语篇层面，统计和分析了连接词的使用情况，并简要分析了科技博客的语篇特点。在超文本分析方面，分别从图片、超链接等新媒体特征和意识形态层面出发分析了科技博客的特点。最后得出如下的结论：和新闻相比，科技博客术语多、互动性强、句子结构较为简单；在篇章上更为精炼，直入主题，专业性强，文化相关性小；同时科技博客也继承了新媒体的一些特征：例如多超链接、图片等。在传播特点方面，受众、媒体风格、时间和出版因素都会不同程度地影响科技博客的编译效果。笔者分析了受众对编译及传播效果的影响，以相关传播学的理论为基础，从受众定位、受众期待这两个方面解释了受众对于科技博客编译的影响，主要体现在受众的专业性和快速获取信息的诉求。其次通过个案分析的方法论证了媒体风格对于科技博客编译的影响；同时笔者发现时间和媒介因素也是编译策略的影响因素之一。通过研究，笔者得出结论：由于语言特点、受众和传播媒介的不同，科技博客的编译策略和普通新闻的编译策略存在一定的差异，主要体现在零翻译策略、新媒体特征的运用、增译和删减内容的侧重不同等。最后，笔者通过深度访谈编译工作者的形式验证了结论的合理性。﹀
外文摘要：	︿ Tech blogging was first introduced into China through trans-editing, so trans-editing plays a crucial role in how tech blogs are communicated. Taking tech blogging as its main object of study, this paper uses text analysis and hypertext analysis to explore the differences between tech blogging and news in language features and trans-editing strategies. In terms of linguistic features, corpora are used for textual and hypertextual analysis of tech blogging. In the textual analysis, words, sentences and discourse are analyzed in turn. At the lexical level, the number and frequency of personal pronouns, technical terms and other salient lexical features of tech blogs are calculated. At the sentence level, average sentence length, tense and voice are analyzed. At the discourse level, the frequencies of conjunctions, as a sign of sentence complexity, are calculated from the corpora, and discourse analyses are carried out to compare the text structure of tech blogging and news. The hypertext analysis focuses on pictures, hyperlinks and other new media features, and on the ideological level of tech blogs. To conclude, compared with news, tech blogs are more interactive and contain more technical terms; with relatively simple sentence structure and more concise texts, they go straight to the subject. They are more specialized in the technical area and thus less culture-bound. Tech blogs also inherit some features of new media, such as hyperlinks and images. In terms of communication characteristics, audience, media style, time and publishing are key factors that affect the results of trans-editing. The author analyzes the impact of the audience on trans-editing and communication results. Based on mass communication theory, the influence of audience targeting and audience expectations is studied; it lies mainly in the professional background of the audience and their need for quick access to information. Second, the influence of media style is demonstrated through case studies. Time and media factors also influence trans-editing strategies. In conclusion, owing to differences in language features, audience and media channel, the trans-editing strategies for tech blogging differ from those for news, for example in the use of zero translation, the use of new media features, and the emphasis of added and deleted content. Finally, in-depth interviews with trans-editors were conducted to verify the soundness of the conclusions. ﹀
分类号：	H087/TP391
论文总页数：	78
参考文献总数：	0
馆藏号：	017/M2015(388)
公开日期：	2015-05-29

基于自适应学习模式的大学英语产出性词汇教学研究.徐亮


题名：	基于自适应学习模式的大学英语产出性词汇教学研究
姓名：	徐亮
学号：	1201210899
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-29
外文题名：	A Comparative Study of Productive Vocabulary Acquisition among College Students Based on Adaptive Learning
关键词：	自适应学习；二语词汇习得；产出性词汇；词汇学习
外文关键词：	Adaptive learning; L2 vocabulary acquisition; Productive vocabulary; Vocabulary learning
论文摘要：	︿词汇学习是第二语言习得最关键的步骤之一，但由于学习者原有词汇水平存在较大差异，现有的词汇学习方法很难满足每个学习者的需求，而自适应学习能根据学习者的相关信息和自适应规则为不同水平的词汇学习者推荐合适的学习材料，实现词汇学习的因材施教。本研究以大学英语产出性词汇的教学为切入点，根据自适应学习和第二语言词汇习得的相关研究成果，尝试将自适应学习方法应用到大学英语产出性词汇的教学中，完成大学英语产出性词汇学习系统的原型设计。本研究设计的产出性词汇学习系统主要分为四个部分：学习资料、用户界面、自适应规则和复习巩固。学习资料部分主要负责大学英语产出性词汇表的确定和词汇学习资料的选取；用户界面部分主要负责词汇学习界面的交互；自适应规则部分主要负责个性化词汇学习材料推荐和社会情感策略指导；复习巩固部分主要负责根据记忆遗忘规律安排个性化的词汇复习。在词汇学习过程中，系统能够根据学习者当前的学习情况和预设的自适应规则为每个学生推荐符合其自身水平的学习资料、复习资料、学习方法和指导意见，提高了词汇学习的效率，培养了学生良好的词汇学习习惯。为了验证系统设计的有效性，本研究在山东某高校60名非英语专业一年级本科生中进行了为期3周的教学实验，目标词为180个大学英语产出性词汇。对照组采用传统非自适应词汇学习方法，所有被试的学习进度、学习任务完全相同；实验组采用本研究设计的自适应系统进行词汇学习，每个被试的学习材料均按需提供，由于实验组被试在一定程度上可以控制学习进度，实验结束时部分实验组被试超额完成了学习任务，部分被试则未完成学习任务。教学实验采用VKS（Vocabulary Knowledge Scale）和VLT（Vocabulary Levels Test）两种产出性词汇测试方法进行即时测试，2周后进行延时测试。另外，本研究还通过调查问卷和访谈的方法了解了两组学生对学习过程、学习效果的主观感受。研究结果表明：本研究设计的自适应词汇学习系统能更有效地促进大学英语产出性词汇的习得效果和保持效果，同时也更受学习者欢迎。﹀
外文摘要：	︿ Vocabulary acquisition is one of the most important steps in second language acquisition. However, current vocabulary instruction cannot meet the needs of learners with different proficiency levels. Adaptive learning offers a solution: it provides personalized learning materials by recording and analyzing information about each learner. Drawing on research on adaptive learning and second language vocabulary acquisition, this paper applies the adaptive learning method to productive vocabulary instruction and presents a prototype design of a productive vocabulary tutoring system. The system consists of four parts: learning materials, user interface, adaptive rules and revision. The learning materials part covers the composition of the productive word list and the selection of related learning materials. The user interface part handles interaction with students. The adaptive rules part handles personalized recommendation of learning materials. The revision part schedules personalized revision based on forgetting curves. During the learning process, the system provides suitable learning materials, revision materials and personalized guidance on vocabulary learning. In this way, it helps students improve their learning efficiency and develop good vocabulary learning habits. To evaluate the effectiveness of the system, a three-week empirical study was conducted among sixty freshmen at a university in Shandong province, with 180 target words. The control group learned vocabulary with the traditional instruction method, and all its participants shared the same pace and tasks; the experimental group learned vocabulary with the tutoring system designed in this paper, and learning materials were provided according to each participant's proficiency level. Since participants in the experimental group could control their learning progress to some extent, some of them finished the scheduled task ahead of time and some did not finish it. After the experiment, all participants took two unannounced productive vocabulary tests: an immediate test and a delayed test. For each test, both VKS (Vocabulary Knowledge Scale) and VLT (Vocabulary Levels Test) were applied. The study also used a questionnaire and interviews to learn the attitudes of learners towards the two learning methods. The results show that the vocabulary tutoring system designed in this paper enhances both lexical learning and lexical retention, and is well received among students. ﹀
分类号：	H087/TP391
论文总页数：	79
参考文献总数：	0
馆藏号：	017/M2015(394)
公开日期：	2015-05-29

语言服务团队术语管理能力评估.许欣蕾


题名：	语言服务团队术语管理能力评估
姓名：	许欣蕾
学号：	1201210910
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-29
外文题名：	Terminology Management Capability Assessment for Language Service Team
关键词：	术语管理；能力评估；语言服务团队
外文关键词：	Terminology management; Capability assessment; Language service team
论文摘要：	︿随着语言服务行业的发展，术语管理在语言服务中的重要性也逐渐得到重视。但是由于各种原因，良好的术语管理意识和操作尚未得到广泛普及。国内关于术语管理的研究较国外而言较为滞后。目前，欧盟有认证的术语管理人才，而国内在这方面还是空缺。本论文所要研究的是语言服务团队术语管理能力评估。本文采用了论文调研法、访谈法、问卷调查法以及实验法，提出了一套评估问卷并验证其有效性。首先通过论文调研，将语言服务团队术语管理能力分为四个层面，分别为：人员、技术、流程和内容。在广泛阅读国内外相关论文的基础上，从这四个层面对前人的理论和实践研究做了综述。基于此，笔者设计了详细的访谈提纲，拜访九位行业内有术语管理经验的专家进行访谈。参考资深人士对于术语管理能力自评问卷项目的重要性打分并结合前期的论文调研，笔者设计出了语言服务团队术语管理能力自评问卷。此后，笔者邀请行业内26个语言服务团队进行该术语管理能力评估。回收完这些团队的问卷后，笔者对其进行了反馈并分析了评估情况。笔者对自评问卷进行指标分析，验证自评问卷的评估效力，并进行相应的调整。为证明自评问卷的实际应用价值，笔者请北京语智云帆科技有限公司的2个团队进行实验。一组为控制组，一组为实验组。这两组团队均参加了术语管理能力自测并得到笔者反馈。笔者根据其中一组的答卷情况，设计了改进方案，对其进行术语管理培训。在术语项目任务一中，实验组的改进措施使得实验组的效率明显高于控制组；任务二，两组均为实验组。以实验前的效率为参照，实施改进措施后的效率明显高于改进前，抽查的术语质量也有所提高。参考受试者的反馈，本论文的自评问卷内容全面，所设置的题目能够有效地帮助受试者找到自身在术语管理方面的弱点；实验证明基于自评问卷的术语管理改进方案可以提高团队术语管理的效率。经过培训和实施术语管理改进方案，实验组的术语管理成果优于控制组，术语管理的效率高于实验前。综上，本论文提出的术语管理能力评估方法能够为改进术语管理提供依据，对提高语言服务行业的术语意识，改善术语管理措施有积极意义。﹀
外文摘要：	︿ With the development of the language service industry, terminology management is becoming ever more important in language services. However, for various reasons, awareness of terminology management and mature processes are still not widespread. Research on terminology management in China still lags behind that abroad. The European Union currently has certified terminology managers, while China has no equivalent examination or certification in this field. This dissertation focuses on assessing the terminology management capability of language service teams. The research methods include a literature review, interviews, surveys and an experiment. Through the literature review, the author divides terminology management capability into four aspects: staff, technology, process and content. Drawing on extensive reading of literature from China and abroad, the author summarizes previous theoretical and practical research on these four aspects. On this basis, the author designed a detailed interview outline and interviewed nine professionals with terminology management experience. These professionals scored the importance of each element of terminology management capability. Taking these scores and the literature review into account, the author designed a terminology management capability self-assessment questionnaire for language service teams. The author then invited twenty-six language service teams to assess their terminology management capability. After collecting the questionnaires, the author analyzed the results and gave each team detailed feedback. The author also analyzed the measurement indexes of the questionnaire to show that it has validity, an appropriate difficulty level and good discrimination indices. Finally, to show that the assessment has practical value, the author invited two randomly assembled teams from Beijing Lingosail Tech Co., Ltd. to take part in an experiment, one as the control team and the other as the experimental team. The author made improvement suggestions based on the experimental team’s results and offered them training. In Task 1 of the terminology project, after the improvement plan was applied, the experimental team performed better than the control team. In Task 2, both teams served as experimental teams. Compared with their performance before the experiment, their efficiency improved greatly after the improvement plan was applied, without sacrificing quality according to a random sample of terms. To sum up, the terminology management capability assessment is thorough. It can effectively help participants find their weaknesses in terminology management and improve on them, and the experiment shows that the assessment has practical value. After applying the improvement plan based on the assessment, the experimental team performed better than the control team, and performance after the experiment was better than before. The assessment can help improve terminology management capability, raise the industry’s awareness of terminology management and have a positive impact on the practice of terminology management. ﹀
分类号：	H087/TP391
论文总页数：	111
参考文献总数：	39
馆藏号：	017/M2015(402)
公开日期：	2015-05-29

跨文化视角下央企英文社会责任报告文本分析.程千

题名：	跨文化视角下央企英文社会责任报告文本分析
姓名：	程千
学号：	1201210540
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	张宏岩
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-29
外文题名：	TEXT ANALYSIS OF CHINESE STATE-OWNED ENTERPRISES’ CORPORATE SOCIAL RESPONSIBILITY REPORTS UNDER CROSS-CULTURAL PERSPECTIVE
关键词：	跨文化；央企；社会责任
外文关键词：	Cross-culture; Chinese state-owned enterprises; CSR
论文摘要：	︿随着全球经济一体化的不断深入，央企的跨文化传播的需求日益明显。在央企的跨文化传播过程中，企业社会责任（Corporate Social Responsibility，简称CSR）这方面的信息披露一直受到国外资本市场监管部门和投资者的关注。在过去几年内，央企遭遇了一些负面的论调，有时正面信息在跨文化传播的过程中也被负面化，无法达到传播目的，这与受众所在国与中国之间的文化差异对信息披露的内容和方式等的影响有着密切关系。鉴于以上背景，本研究以文化维度理论和高低语境理论为指导，结合语体研究法和内容分析法，选取了30家央企近两年发布的英文CSR报告作为语料进行文本分析，并以30家美国企业的CSR报告为参照对象，通过差异性对比，发现央企在跨文化传播中的不足之处及背后的成因，并尝试提出改进建议。在语言层面，建立单语可比语料库进行语体分析，采用了语体正式程度和避免直接称呼程度中的8个衡量指标，并增加了政治敏感度这一指标。在内容层面，首先，利用内容分析法中的指标模型，从企业社会责任类型和利益相关者类型展开研究，选取关键词，进行词频统计分析；其次，采用定性分析和案例分析，探讨央企和美企CSR报告中的负面信息披露的数量和语言策略上的差异。研究发现，在企业社会责任类型方面，央企更侧重于可持续性发展责任，而美企更看重经济责任。在利益相关者方面，央企对于社会性利益相关者、股东和政府的关注度较高，而美企对首要利益相关者、员工和社会公众的重视程度较高。此外，央企的CSR报告比美企的CSR报告的正式程度和避免直接称呼程度更高，且具有较高的政治敏感度。最后，本研究依据文化维度理论和高低语境理论对这些差异性进行原因分析，结果符合研究假设。本研究是对央企企业社会责任的跨文化传播这一研究主题的一次创新探讨，期望能为央企实现更有效的跨文化传播提供有益的参考。﹀
外文摘要：	︿ With deepening economic integration, the demand for cross-cultural communication by Chinese state-owned key enterprises is increasingly evident. In the cross-cultural communication of these enterprises, Corporate Social Responsibility (CSR) disclosure has always been a shared concern of foreign capital-market regulators and investors. Over the past few years, Chinese state-owned key enterprises have encountered unfavorable views, and even their positive information has sometimes turned negative in transmission, which has made foreign citizens, stockholders and partners skeptical of their performance in social responsibility. This is closely related to the influence of cultural differences between the recipients’ countries and China on what is disclosed and how. Guided by the “Culture Dimension” and “Context of Culture” theories, and combining the linguistic theory of register with content analysis, this thesis studies 30 English-version CSR reports from 30 Chinese state-owned key enterprises, with CSR reports from 30 American enterprises as the reference. From a cross-cultural perspective, the thesis compares the language and content of the CSR reports of Chinese state-owned key enterprises and American enterprises, in order to identify the inadequacies of Chinese state-owned key enterprises in communicating their CSR across cultures and to put forward recommendations for improvement. In terms of content comparison, the thesis first adopts the indicator model of content analysis, selects keywords for first-class and second-class indicators, analyzes keyword frequency and performs an F-test on the results. Second, the thesis uses qualitative analysis and case studies to further explore how Chinese state-owned key enterprises disclose negative information in CSR reports. As for linguistic comparison, the thesis uses two scales of register that indicate the addresser’s attitude towards the addressee, “formality” and “impersonality”, and adds “sensitiveness to politics” as a further linguistic indicator. The results show that the CSR reports of Chinese state-owned key enterprises have higher degrees of formality and impersonality than those of American enterprises. The reports also indicate that, with regard to CSR type, Chinese state-owned key enterprises focus comparatively more on sustainable development responsibility, while American enterprises focus more on economic responsibility. As for stakeholders, Chinese state-owned key enterprises tend to value responsibility towards the government and stockholders, while American enterprises pay more attention to responsibility towards employees and the public. Finally, the thesis probes the underlying reasons on the basis of the “Culture Dimension” and “Context of Culture” theories and concludes that the results conform to the assumptions. The study is a small step forward in research on the cross-cultural communication of CSR by Chinese state-owned key enterprises, and it is hoped that it will provide some guidance to help these enterprises communicate more effectively across cultures. ﹀
分类号：	H087/TP391
论文总页数：	69
参考文献总数：	0
馆藏号：	017/M2015(364)
公开日期：	2015-05-29

中德现场工程师的跨文化冲突研究——以北京奔驰发动机项目为例.刘天意

题名：	中德现场工程师的跨文化冲突研究——以北京奔驰发动机项目为例
姓名：	刘天意
学号：	1201210713
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	张宏岩
导师1单位：	软件与微电子学院
导师2姓名：	朱源
导师2单位：	中国人民大学外国语学院
论文答辩日期：	2015-05-29
外文题名：	Research on Intercultural Conflicts between Chinese and German Engineers on Shop Floor--A Case Study of BBAC Project
关键词：	文化差异；跨文化沟通；跨文化冲突；沟通技巧
外文关键词：	Cultural difference; Intercultural communication; Intercultural conflict; Communication skills
论文摘要：	︿多年来，中德经贸合作关系是两国关系发展的重要部分，但是巨大的文化差异使双方员工在合作共事中容易出现跨文化冲突。在国际工程项目的团队成员中，现场工程师是最重要的组成部分之一，他们之间的跨文化冲突直接影响团队和谐与项目的顺利进行；但是国内外对中德跨文化沟通的研究起步晚，对国际工程项目中现场工程师的跨文化冲突的研究更是屈指可数。因此，对中德现场工程师的跨文化冲突研究具有重要的理论和实践意义。基于上述背景，本文在以下几方面做出研究：分析了研究的理论背景与意义、行业背景和需求，对研究对象北京奔驰发动机项目以及西门子跨文化团队进行介绍，介绍了研究的主要内容和所采用的主要研究方法。对国内外有关跨文化沟通、跨文化冲突和文化差异的理论进行了研究；研究了中德现场工程师跨文化冲突的表现，从团队内部、客户和供应商三个方面，通过具体的案例还原中德现场工程师中存在的跨文化冲突；通过对跨文化冲突表现的分析，结合具体的访谈及问卷调查，得出影响中德现场工程师跨文化冲突的原因，包括国家层面、企业项目层面以及个体层面三方面，国家层面从中德两国的国情与社会结构以及中德文化差异进行分析，企业项目层面从中德现场工程师跨文化沟通的特殊性、沟通渠道及跨文化培训三方面进行总结与分析，个体层面主要从个体背景与性格、语言和思维方式、价值观与信任、沟通方式与行为方式进行总结与分析；根据跨文化冲突的表现以及影响跨文化冲突的因素，结合具体的访谈及问卷调查，得出相应的应对跨文化冲突的建议以及一些可行的沟通技巧，主要包括文化和价值观、思维方式和沟通方式、跨文化沟通渠道、跨文化培训以及工程师的跨文化教育等方面。本文通过对西门子跨国团队内中德现场工程师跨文化冲突的分析，为其他跨文化团队更好的进行跨文化沟通、应对跨文化冲突以及进行跨文化合作项目提供一些参考。﹀
外文摘要：	︿ Sino-German economic and trade cooperation has been an important part of the development of both countries for many years. However, because of great cultural differences, engineers with different backgrounds may encounter many intercultural conflicts in their work. Engineers play a very important role in the whole team, and the intercultural conflicts between them can affect the harmony of the team and the progress of the project. However, there is little research, at home or abroad, on intercultural communication and intercultural conflicts between Chinese and German engineers. Research on these conflicts is therefore of theoretical and practical importance. Based on previous studies and the industry situation, this paper first analyzes the theoretical background and industry demand, introduces the BBAC Project, the Siemens team, the main contents and the research methods, and reviews the related theories on intercultural communication, cultural difference and intercultural conflict. Second, it examines intercultural conflicts within the team, with the client and with the supplier, including conflicts of interest, value, emotion, cognition and target. Third, it analyzes the causes of intercultural conflicts at the national, project and individual levels, drawing on a brief analysis of the conflicts together with interviews and questionnaires. The national level covers national conditions and cultural differences; the project level covers the particularity of the project, communication channels and intercultural training; and the individual level covers personal background and personality, language, ways of thinking, ways of solving problems, values and trust. Finally, based on the conflicts and their causes and on the interviews and questionnaires, this paper proposes feasible suggestions and communication skills concerning culture and values, ways of thinking and solving problems, communication channels, intercultural training and the intercultural education of engineers. Through the case study of the Siemens international team, this research offers a reference for other international teams seeking to communicate better across cultures and overcome intercultural conflicts so that intercultural projects develop more smoothly. ﹀
分类号：	H087/TP391
论文总页数：	107
参考文献总数：	42
馆藏号：	017/M2015(459)
公开日期：	2015-05-29

婚姻研究中文化负载词的汉译策略—以《为婚姻正名》为例.郭皓洁

题名：	婚姻研究中文化负载词的汉译策略—以《为婚姻正名》为例
姓名：	郭皓洁
学号：	1201210586
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-29
关键词：	婚姻研究；文化负载词；文化翻译
论文摘要：	︿本翻译研究译本《为婚姻正名：为何已婚人士更幸福、健康、富有》是由芝加哥大学社会学教授琳达·韦特与美国保守派专栏作家玛吉·加拉格尔合著完成，并于2000年出版问世。著者通过对现有百余篇权威社会调研成果的数据分析整理、婚姻家庭类受访案例剖析，悉数前人婚姻理论的纰漏及众口铄金的五大婚姻谣言，从生理、心理、经济、法律、政策等维度例证婚姻的必要性，提倡以全新的婚姻范式走出昔日误区并重塑积极的婚姻观。书中出现了很多婚姻文化负载词，然而鉴于中西传统婚俗观、现行婚姻体制以及婚姻文化新词的迥异，源语言与目标语言间不能强制进行文化移植，从而这些带有文化烙印的词汇构成了婚姻研究类文本的翻译难点。本文基于《为婚姻正名》一书的翻译体验，针对婚姻研究中文化负载词的处理提出两大翻译原则：遵循研究者的婚姻视阈和置身源文本的时空指涉。在上述原则的指导下，笔者将该类文本中的文化负载词分为传统典故类、现行文化专有名词类以及新兴文化专有名词类。并针对上述三种类型提出相应的翻译策略，即传统典故类采用文化直现、文化替换以及文化过滤策略；现行文化专有名词类采用参照标准、比较与选择、推理与验证的策略；新兴婚姻词类采用合理仿译、批判与创造的策略。笔者通过使用翻译实践的实例对策略进行阐述，说明策略的有效性，以期为同类书籍的翻译提供借鉴。鉴于翻译书籍的社会学研究性质，本文在对翻译实例的分析过程中，尤其注意对该领域内专业知识的详细说明，旨在增强译者的专业背景知识，加强婚姻研究领域的中西学术交流。﹀
分类号：	H087/TP391
论文总页数：	189
参考文献总数：	40
馆藏号：	017/M2015(510)
公开日期：	2015-05-29

平行文本在社科类著作翻译中的应用——以The Children of Chinatown的翻译为例.祁红坤

题名：	平行文本在社科类著作翻译中的应用——以The Children of Chinatown的翻译为例
姓名：	祁红坤
学号：	1201210764
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	外国语学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-29
外文题名：	Application of Parallel Text in Translation of Social Science Works--A Case Study of the Translation of The Children of Chinatown
关键词：	平行文本；社科翻译；翻译策略
外文关键词：	Parallel Text; Social Science Translation; Translation Strategies
论文摘要：	︿本文基于对The Children of Chinatown的翻译实践，该书由美国学者温迪·劳斯·杰瑞创作完成，于2009年由北卡罗来纳大学出版社出版，讲述了1850—1920年间美国旧金山唐人街的故事。作者通过各种途径调查了美国旧金山早期唐人街的形成和发展过程，专门研究了华人儿童的生活场景，旨在探讨儿童在早期唐人街的重建中发挥的巨大作用，唤起人们对早期儿童移民的关注。书中涉及大量历史文化内容，包括华人移民问题以及种族问题，给翻译造成了难点。因此，笔者在翻译中选取了几部关于美国华人的著作为平行文本，比如，《金山路漫漫》、《唐人街：共生与同化》及《纽约唐人街》等，并参考书中内容展开此次翻译项目。从话题和体裁方面看，这几部著作与笔者所译书籍都存在很高的关联度，能在多方面为译者提供借鉴，例如可以迅速补充译者所需的专业知识，规范译者的表达方法。另外，笔者提出了选取平行文本时所应遵循的依据，以及据此采取的翻译策略，即规范术语、借鉴表达和添加注释。最后，本文对平行文本在翻译中的运用进行了总结，以期为同类书籍的翻译提供参考。鉴于翻译书籍的特殊性，本文在阐释翻译实例时注重分析文化背景知识，尤其是19世纪中期的美国社会背景和历史知识，旨在增强译者和读者的文化素养。﹀
外文摘要：	︿ This research is based on the translation of The Children of Chinatown, which was written by Wendy Rouse Jorae and published by the University of North Carolina Press in 2009. Delineating the history of early Chinatown in San Francisco from 1850 to 1920, the author investigates its formation and development through the daily life of Chinese children. The book aims to discuss the important role Chinese children played in the construction of Chinatown, thus drawing people’s attention to early immigrant children. As it contains much information about history and culture, including Chinese immigration and racial discrimination, the book poses great challenges to a translator who is not necessarily an expert in the field. Using parallel texts in translation can therefore quickly familiarize translators with their task and standardize their expressions. This research uses the following books as parallel texts, in both the original and the translated versions, because they are closely related to the source book in topic and form: A Place Called Chinese America, Chinatown: A Study of Symbiosis and Assimilation, and Chinatown, New York. It also discusses the criteria for selecting parallel texts and puts forward three translation strategies, namely standardizing terms, borrowing expressions and applying annotations. Given the particularity of the translated text, the thesis emphasizes the analysis of its cultural background, especially the social and historical context of America in the 19th century, so as to improve the translator’s cultural literacy. ﹀
分类号：	H087/TP391
论文总页数：	175
参考文献总数：	27
馆藏号：	017/M2015(523)
公开日期：	2015-05-29

财经类通俗读物中人称代词的翻译策略——以Brilliant Accounting一书的汉译为例.魏宁

题名：	财经类通俗读物中人称代词的翻译策略——以Brilliant Accounting一书的汉译为例
姓名：	魏宁
学号：	1101210959
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-29
外文题名：	THE TRANSLATION STRATEGIES OF PERSONAL PRONOUNS IN POPULAR ECONOMIC AND FINANCIAL BOOKS—TAKING THE TRANSLATION OF BRILLIANT ACCOUNTING AS EXAMPLES
关键词：	人称代词；翻译策略；财经类通俗读物
外文关键词：	personal pronouns; translation strategy; financial and economic popular books
论文摘要：	︿人称代词是日常对话、书面表达中非常常见的一种代词。然而，人称代词的使用虽然常见，但用法却复杂多样，具有泛指、虚指等非常规用法。人称代词的使用对语篇的感情色彩、语义表达、语用效果等都有不小的影响。另外，英汉两种语言中人称代词的用法也有所差异。本文侧重翻译实践，重点研究财经类通俗读物Brilliant Accounting: Everything You Need to Know to Manage the Success of Your Accounts一书中人称代词的翻译。大量使用人称代词是本书的一大特点，且书中的人称代词在缩短与读者的距离、增强与读者的互动性方面发挥着重要作用。再加上英汉两种语言中人称代词的用法、使用频率有一定差异，因此，人称代词的翻译是本书翻译的一个重点，也是一个难点。本文综述了英汉两种语言中人称代词的用法，分析了英汉两种语言中人称代词的用法差异，再结合财经类通俗读物的文体特点和人称代词的使用特点，借鉴了多本财经类通俗读物中人称代词的翻译方法，以Brilliant Accounting: Everything You Need to Know to Manage the Success of Your Accounts一书为例提出了财经类通俗读物中人称代词的翻译原则和翻译策略。﹀
外文摘要：	︿ Personal pronouns are among the most common pronouns in both daily communication and writing. Common as they are, their usages vary widely and include non-standard uses such as bleaching and generic reference. Their use has a considerable impact on the emotional coloring and pragmatic effect of a passage. This paper focuses on translation practice and studies the translation of personal pronouns in a popular finance book, Brilliant Accounting: Everything You Need to Know to Manage the Success of Your Accounts. The book features a large number of personal pronouns, which play an important role in shortening the distance between the author and readers and in improving the interaction between them. Moreover, the usage and frequency of personal pronouns in English differ from those in Chinese. The translation of personal pronouns is therefore one of the difficulties in translating this book. This paper first summarizes the usages of personal pronouns in Chinese and English and then analyses the differences between the two languages. Combining the stylistic features of popular financial books with the characteristics of their use of personal pronouns, and drawing on existing translations of other popular economic and financial books, this paper puts forward 3 translation principles and 6 translation strategies for translating personal pronouns in popular financial and economic books. ﹀
分类号：	H087/TP391
论文总页数：	198
参考文献总数：	0
馆藏号：	017/M2015(535)
公开日期：	2015-05-29

互联网技术科普书籍中插图文本的翻译策略.刘雨萌

题名：	互联网技术科普书籍中插图文本的翻译策略
姓名：	刘雨萌
学号：	1201210722
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-29
外文题名：	Translation Strategies of Texts in Illustrations of Popular Science Books on Internet Technology
关键词：	科普书籍；互联网技术；插图文本；翻译策略
外文关键词：	Popular science books; Internet technology; Texts in illustrations; Translation strategy
论文摘要：	︿本文是基于Advanced Google AdWords一书翻译的研究报告。笔者在进行Advanced Google AdWords一书的翻译过程中，发现插图在互联网技术科普书籍中起着至关重要的作用，它与正文本各司其职、紧密配合，一同将书中的内容进行阐释和描述，是互联网技术科普书籍中一个非常重要的组成元素。因而在互联网技术科普书籍的翻译中，插图文本的翻译极其重要。在我国，互联网技术科普书籍的翻译已经发展了一段不短的时间，但是对于该类书籍中插图文本的翻译还处于起步阶段，存在以下几个问题：一是任何体裁的插图文本的翻译都没有引起过译者的重视，所以互联网技术科普书籍中的插图文本翻译没有有价值的可借鉴经验。二是互联网技术科普书籍中插图文本的功能、形式与特点的研究都十分不足。三是互联网技术科普书籍中插图文本的翻译本身没有系统的理论指导，译者各自为政。为了解决上面的三个问题，本文从为数不多的可借鉴的材料出发，对Advanced Google AdWords中出现的大量插图文本进行分析研究，首先探讨了科普书籍的定位、特点、语言风格。然后探究了插图文本的定义、分类、形式与功能，将二者与作者使用插图的目的有机结合起来，联系自己的翻译实践，将互联网技术科普书籍中的插图功能总结为概括标示功能、强化印象功能、补充文本功能、加深认知功能，并提出了的互联网技术科普书籍中插图文本的翻译策略——互文法、不译法、以图译图法、注释翻译法。﹀
外文摘要：	︿ This paper is a research report based on the translation of Advanced Google AdWords. While translating the book, I found that illustrations work closely with the main text to interpret and describe the content, and so play an important role in popular science books. The text in illustrations is therefore an indispensable part of popular science books, and its translation is very important. In China, the translation of popular science books on Internet technology has a long history, but the translation of texts in illustrations in such books is still in its infancy. Several problems exist. First, in no genre has the translation of texts in illustrations attracted translators’ attention, so there is no valuable experience to draw on. Second, studies of the function, form and characteristics of illustrations in popular science books on Internet technology are insufficient. Last, there is no systematic theoretical guidance for translating texts in illustrations. To address these problems, this paper first discusses the positioning, characteristics and language style of popular science books. It then explores the definition, forms and functions of texts in illustrations. Finally, the paper relates the two to the author’s purpose in using illustrations in Advanced Google AdWords. In this way, it summarizes the functions of illustrations in popular science books on Internet technology as summarizing and marking, strengthening impressions, complementing the text and deepening cognition. On the basis of these functions, the paper puts forward four strategies for translating texts in illustrations in popular science books on Internet technology: intertextuality, non-translation, translating illustrations with illustrations, and annotated translation. ﹀
分类号：	H087/TP391
论文总页数：	245
参考文献总数：	17
馆藏号：	017/M2015(587)
公开日期：	2015-05-29

基于人际意义的员工手册翻译策略研究——以 FCA Employee Handbook 2014 为例.何丹

题名：	基于人际意义的员工手册翻译策略研究——以 FCA Employee Handbook 2014 为例
姓名：	何丹
学号：	1201210592
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-29
关键词：	员工手册翻译；人际意义；翻译策略
论文摘要：	︿员工手册是伴随着市场经济的产生与发展而诞生的一种日常经济文书。员工手册是企业规章制度、企业文化与企业发展战略的浓缩，同时也发挥着展示企业形象、传播企业文化的作用。它一方面需要体现企业宏观上的机构建制，规定具体的员工权利义务和工作操作规范，另一方面它还要为企业的日常工作运营提供法律等方面依据，并向员工传达企业的文化与精神。从企业的角度来看，员工手册可以成为企业有效管理的工具，而从员工的角度出发，员工手册是员工了解企业形象、认同企业文化的渠道，也是员工实际工作行为的参考指南。目前对于员工手册的研究集中在单语的写作规则以及相关阐释上，员工手册英译汉的翻译策略和规则却鲜少有人涉及。笔者选择以 FCA Employee Handbook 2014 节选文本为研究对象，采用定性和定量的研究方法，以 Halliday 系统功能语言学中的人际意义理论为指导，从语气系统、情态系统和人称系统三个层面进行分析，归纳并总结英文员工手册文本所实现的人际意义和具体表现特征。通过对统计数据的分析，笔者总结归纳出以下发现：在语气系统中，陈述语气的出现频率最高，其次是祈使句，疑问句出现的次数最少，此类语气结构的综合使用旨在向员工提供信息并寻求员工对企业机构的认同和支持；人际意义在情态系统中的实现主要依赖于情态功能词的使用，在英文员工手册所使用的情态功能词中，低值和中值情态助动词使用的次数相对较多，目的在于表现出尊重员工的态度并在书面交流中将员工放在与企业平等的地位，期待获得员工的认可；而在人称系统中，第一人称代词复数 we 和第二人称代词 you 使用最为频繁，其目的在于缩短企业与员工的社交距离，让员工产生更强的集体认同感与遵守企业规章制度的责任感,企业机构名称的适当使用则有助于明确企业制度规范中具体职责义务的划分与履行，并让企业机构员工更好地履行其应有的责任与义务。根据上述分析以及笔者的 FCA Employee Handbook 2014 节选文本英译中翻译实践，笔者提出了补译、顺译与倒译、使用无主语句式等多项翻译策略与原则，并利用翻译实践中的示例对于此类翻译策略的应用效果加以说明。笔者的研究有助于实现英文员工手册人际意义在中文译本中的完整传达，此外也可以为今后的员工手册翻译实践和相关研究提供一定的帮助。﹀
分类号：	H087/TP391
论文总页数：	177
参考文献总数：	18
馆藏号：	017/M2015(612)
公开日期：	2015-05-29

威尔斯翻译理论在本地化项目管理学术著作翻译中的应用.张海兰

题名：	威尔斯翻译理论在本地化项目管理学术著作翻译中的应用
姓名：	张海兰
学号：	1101211053
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-29
外文题名：	The Application of Wilss’s Translation Theory in Academic Translation of Localization Project Management
关键词：	语篇对等；跨学科；语内翻译；语际翻译；功能对等
外文关键词：	functional equivalence; textual equivalence; Science of Translation
论文摘要：	︿学术著作翻译在学术交流的过程中起着重要的桥梁作用，关于自然科学，文科以及社科类著作的翻译也并不少见，但是就复合领域如本地化项目管理这样融合工商管理，计算机科学和翻译三大领域的译著鲜有著述。其中的原因包括本地化是相对新兴的专业领域，尤其是在国内相关方面的学术著作屈指可数。笔者选取了美国翻译协会学术专著第16卷的Translation and Localization Project Management: Art of the Possible这本书，鉴于其跨自然科学和人文科学，专业术语多，夹杂文化负载词的特点，以及排比句和被动句的翻译难点，相应选取了现在翻译流派中威尔斯（Wolfram Wilss）关于深层结构和表层结构相似，语际和语内翻译的理论框架，坚持翻译由语义语法普遍存在的深层结构进行导航，构建语义对等，语法对等以及反应对等。﹀
外文摘要：	︿ Academic translation acts as a bridge in academic communication. Translation in the sciences or the arts is not rare at all, but in interdisciplinary areas such as localization project management, which combines business administration, computer science and translation, translation studies and research are scarce. One reason is that localization has boomed only in recent years, and localization experience is not as widespread at home as abroad. I chose Translation and Localization Project Management: Art of the Possible, volume 16 of the scholarly monograph series compiled by the American Translators Association. As the source text contains multidisciplinary terms, culture-loaded words, parallelisms and passive sentences, I chose Wolfram Wilss’s science of translation as guidance for the translation work; it addresses the similarity between the deep and surface structures of language. Guided by deep structure, the translation of this multidisciplinary academic text achieves grammatical equivalence, semantic equivalence and response equivalence. ﹀
分类号：	H087/TP391
论文总页数：	237
参考文献总数：	22
馆藏号：	017/M2015(782)
公开日期：	2015-05-29

顺应论视角下英语插入语的翻译研究——以 Living and Dying With Cancer为例.马占领

题名：	顺应论视角下英语插入语的翻译研究——以 Living and Dying With Cancer为例
姓名：	马占领
学号：	1201210743
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-29
外文题名：	Analysis of English Parenthesis Translation from Perspective of Adaption Theory
关键词：	插入语；顺应论；插入语翻译；顺应论与插入语
外文关键词：	parenthesis; parenthesis translation; theory of adaption
论文摘要：	︿翻译项目基于《与癌症同生共死》一书，作者安吉拉·阿姆斯特朗·科斯特，该书讲述了作者连续六年多次采访三十名患者，但是着重讲述了其中的十二位，因为他们的经历更能体现死亡过程中的一些细节。在翻译此书的过程中，笔者发现书中出现了大量插入语，种类繁多，形式各样，翻译的难易程度各不相同。于是，笔者决定以插入语为研究对象，以顺应论为指导来研究插入语的翻译策略，以使插入语翻译在文章中实现语篇衔接、语言结构、读者的物理世界、社交世界以及心理世界的顺应。本文以《与癌症同生共死》一书中出现的插入语为研究对象，以顺应论为指导，从以下几个方面进行了研究：一，翻译书籍简介、选题背景、研究目的和论文结构；二，文献综述；三，顺应论视角下顺应汉语语言语境和语言结构的翻译策略；四，顺应论视角下顺应中国读者交际环境的翻译策略；五，结论。立足于英汉结构和文化环境的不同，并在分析实例的基础上，本文总结出以下翻译方法，包括顺承法、包孕法、重组法、拆离法、注释法、直译法、意义法和综合法等。以顺应论为指导可以使汉译后的插入语更加符合汉语的句法结构，篇内衔接更连贯、更紧密。同时，还可以顺应中国读者的物理世界、心理世界和社交世界。在顺应论的指导下，插入语的翻译不仅可以达到语言结构的完整还可实现文化的交流，引起中外读者的共鸣。﹀
外文摘要：	︿ This translation project is based on Living and Dying with Cancer, written by Angela Armstrong-Coster and published in 2004. In the book, the author recounts interviewing 30 patients repeatedly over more than six years, but she concentrates on twelve of them because their situations reflect specific details of the dying journey. In the course of translating, the author of this paper found a large number and variety of English parentheses in the book. The author of this paper therefore decided to study strategies for translating English parentheses under the guidance of Adaption Theory, so that the translation achieves textual cohesion and adapts to the language structure and to the reader’s physical, social and mental worlds. Taking the English parentheses in Living and Dying with Cancer as its object and Adaption Theory as its guide, this paper covers the following: first, an introduction to the book and the background, purpose and structure of the study; second, a literature review; third, translation strategies that adapt to the Chinese linguistic context and language structure; fourth, translation strategies that adapt to readers’ communicative context; fifth, the conclusion. Based on the differences in structure and cultural environment and on an analysis of cases in the book, this paper summarizes the following strategies, including sequential translation, inserting, recasting, embedding, splitting-off translation and annotation. Under the guidance of Adaption Theory, the translation can conform to Chinese syntactic structure and read coherently. In addition, it can achieve cultural communication and resonate with both Chinese and foreign readers. ﹀
分类号：	H087/TP391
论文总页数：	183
参考文献总数：	30
馆藏号：	017/M2015(342)
公开日期：	2015-05-29

2015-05-28

基于条件随机场的用户查询日志中的影视类命名实体识别.李高扬

题名：	基于条件随机场的用户查询日志中的影视类命名实体识别
姓名：	李高扬
学号：	1201210645
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-28
关键词：	自然语言处理；命名实体识别；用户查询日志；影视领域
论文摘要：	︿伴随着信息产业的蓬勃发展，人们日益增长的信息要求和网络上海量数据杂乱无章的现实产生了很大的矛盾。为了解决这个问题，各大互联网公司都在积极地对网上的海量信息进行处理。这就使信息抽取成为当今业界一项非常核心的技术，信息抽取的一个子领域是命名实体识别——Named Entity Recognition。实体识别定义为将文本中的预定义好类别的特定名词词汇定位出来。本人在一家互联网公司下的视频垂直搜索部实习，所从事的工作着眼于提升搜索引擎的用户体验。能否把用户查询串中的实体精准识别出来，对理解用户意图有着重要的意义。本文是基于条件随机场模型的对用户查询串中的影视类命名实体进行识别。本文是要识别电影、电视剧、电视节目、动漫四种影视类实体。电影如《超能陆战队》《拯救大兵雷恩》，电视剧如《家族的秘密》《红粉世家》，电视节目如《美丽中国》《挑战者》，动漫如《金刚葫芦娃》。条件随机场是一个可以融合多个并非独立的特征的模型。用户查询串是短文本的一种，它的信息少、不规范、容易受噪音影响。本文利用实体公司一个月的用户查询日志，编写Hadoop应用程序进行了转化和清洗。接着利用关联分析的方法从清洗后的用户查询日志中构建了一个实体词典。实体识别在训练模型阶段，需要对语料做大量的标注工作，本文利用自动结合人工的办法对用户查询串做了实体标注。在应用条件随机场模型的过程中，本文抽取了五类特征：用户查询串的字符特征、用户查询串的词语特征、实体词典特征、实体前后缀特征以及纠错近似度标识特征。其中纠错近似度标识表征了两个词语之间的易混淆程度。在构建实体前后缀词典的时候为了降低算法的时间复杂度，使用了字典树的数据结构。在计算纠错近似度标识的时候，使用了编辑距离概念并且构建了一套挖掘近义词的流程。依次增加词典特征和前后缀特征、纠错近似度标识特征之后进行对比实验。实验结果证明了引入的外部词典信息一定程度上能够改善短文本的信息缺失；引入纠错近似度标识特征能够降低短文本噪音干扰，明显提升识别召回率。与前人的算法做对比，证明本文提出的一套框架能够在时间效率和识别效果上有明显的优势。实体识别完成之后，为了解决实体歧义的问题，本实验又构建了基于属性抽取的实体链接模块。针对搜索串幂律分布的特点，本实验构建了缓存模块以提高识别速度。最后本实验构建了一个识别视频的集数、季数、期数的模块。﹀
分类号：	TP311.52
论文总页数：	61
参考文献总数：	0
馆藏号：	017/M2015(493)
公开日期：	2015-05-28
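
The abstract above mentions using edit distance when computing its correction-similarity feature (纠错近似度标识), without giving the formula. The following is a minimal sketch of that building block, assuming plain Levenshtein distance; the `similarity` normalisation and both function names are illustrative assumptions and are not taken from the thesis.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical normalised score in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest
```

Under this assumed normalisation, a query that differs from a dictionary entity by one character out of six (for example 拯救大兵瑞恩 against 拯救大兵雷恩) scores 1 - 1/6, so a threshold on the score could flag likely misspellings of known titles.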

基于认知心理的计算机辅助翻译工具界面探索.郑江锋

题名：	基于认知心理的计算机辅助翻译工具界面探索
姓名：	郑江锋
学号：	1201211042
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-28
外文题名：	Research on Interface of Computer Aided Translation Tools Based on Cognitive Psychology
关键词：	软件界面；翻译认知；眼动追踪；按键记录；记忆库提示
外文关键词：	User interface; Translation cognition; Eye-tracking; Key-logging; TM prompt
论文摘要：	︿伴随着现代语言服务行业的蓬勃发展，以技术文档为主要翻译素材的专业译者面临着翻译任务重、时间紧、翻译内容一致性要求高等挑战，因此，计算机辅助翻译工具正扮演着越来越重要的角色。然而，目前市场上的计算机辅助翻译工具种类繁多，很多软件界面元素堆积拥挤，大大降低了译者的翻译效率。本文从专业译者的认知心理角度考虑，以计算机辅助翻译工具界面元素为研究对象，旨在提出对译者认知负担最小的软件界面设计建议，以帮助译者最大程度提高翻译效率，改善翻译质量。同时，本文也致力于为计算机辅助翻译软件的界面设计提供新的方法借鉴和参考。本文围绕翻译记忆库原文提示模式、翻译记忆库译文提示模式、翻译工作区和翻译记忆库的位置摆放展开讨论，以眼动追踪和键盘按键记录为研究手段，设计了三大实验。实验一融入了记忆库原文提示模式（增删改、高亮不匹配和高亮匹配）和待编辑译文提示两大考察变量；实验二主要研究译者在做质量检查时认知负担较小的原文译文摆放模式；实验三主要探究对译者最佳的的记忆库和工作区摆放模式。实验主要记录了被试在后编辑过程中凝视时间、注视次数、停顿时间、各兴趣区注视占比及眼跳次数等数据以分析被试在各个考察变量下的不同认知状态。通过对眼动数据和按键记录数据的线性回归分析，实验一发现记忆库原文的不同提示模式主要引起被试对不同兴趣区的注视程度差异，增删改模式在翻译质量上明显优于其他二者，但不同提示模式对译者的认知负担影响与建议句段和待翻译句段之间匹配程度是否涉及语法结构变化有关；在记忆库译文提示方面，我们发现当待编辑句段的提示正确率不到70%时，会引发译者对记忆库原文更多的关注，从而延长译者的翻译用时。实验二发现当原文译文上下摆放时，对译者质量检查时的认知负担受原文行数影响较大，左右模式相较之下受行数的影响较小。实验三发现记忆库和工作区原文译文的不同摆放位置对译者在后编辑时的影响不大，记忆库和工作区原文译文的呈现方式差异在翻译时可以被忽略。﹀
外文摘要：	︿ With the fast development of the modern language services industry, professional translators, who mostly translate technical documents, face growing challenges such as heavy workloads, tight deadlines and strict requirements for translation consistency. Computer-aided translation (CAT) tools are therefore playing a more important role than ever. However, many of the CAT tools on the market have crowded interfaces, which reduce translators’ working efficiency. Taking the perspective of professional translators’ cognition and the interface elements of CAT tools as its research object, this paper aims to propose interface design suggestions for CAT tools that reduce translators’ cognitive load, increase translation efficiency and improve translation quality. It also offers a new method and reference for CAT tool interface design. Using eye-tracking and key-logging as research methods, this paper designs three experiments to examine the prompt mode of the Translation Memory Source Text (TMST), the effect of the prompt in the suggested target text, and the placement of the Translation Workplace and the Translation Memory. Experiment 1 explores the best prompt mode of the TMST and the effect of the prompt in the suggested target text. Experiment 2 explores which placement of source text and target text imposes a smaller cognitive load on translators doing quality assurance work. Experiment 3 seeks the best placement of the Translation Memory and the Translation Workplace for post-editing. To investigate the participants’ cognition, the researcher records their gaze time, fixation counts, pause time, fixation proportion in different areas of interest (AOIs), saccade counts and other measures. Linear regression analysis of the eye-tracking and key-logging data in Experiment 1 suggested that different prompt modes of the TMST caused differences in fixation across AOIs, and that translation quality with the prompt mode of SDL Trados was better than with the other two modes. Which prompt mode performed best in terms of cognitive load depended on the degree of grammatical match between the suggested sentence in the Translation Memory and the new sentence to be translated. With regard to the prompt in the suggested target text, we found that when its accuracy was below 70%, the prompt led participants to fixate longer and more often on the source text, and thus to spend more time on translation. Experiment 2 showed that, in terms of cognitive load during quality assurance, the up-down layout of source text and target text was more sensitive to the number of lines of source text than the left-right layout. Experiment 3 showed that different placements of the Translation Memory and the Translation Workplace had little impact on participants doing post-editing, so their influence on translation can be ignored. ﹀
分类号：	H087/TP391
论文总页数：	95
参考文献总数：	72
馆藏号：	017/M2015(503)
公开日期：	2015-05-28

分布式系统的升级和数据迁移问题研究.黄礼骏

题名：	分布式系统的升级和数据迁移问题研究
姓名：	黄礼骏
学号：	1201210610
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-28
外文题名：	The Research and Implementation of Software Upgrades and Migrations for Distributed Systems
关键词：	云计算；分布式系统；升级；数据迁移
外文关键词：	Cloud Computing; Distributed System; Upgrade; Migration
论文摘要：	︿随着社交网络、移动互联网、电子商务等技术的不断发展，互联网的使用者们贡献了越来越多的内容。为了处理这些内容，每个互联网公司在后端都有一套成熟的分布式系统用于数据的存储、计算以及价值提取[1]。不仅如此，许多银行、政府机构、大型企业对大规模数据存储与处理都有着强烈的需求，这也解释了为什么时下云计算这么火热并且持续被看好。然而，构建这种系统的软件往往很复杂，同时，每隔一段时间，系统都需要进行升级以提升性能、修改错误或增加新的功能。本文工作所面临的最基本的问题就是如何有效地设计一种工程上可行的升级方案，使得这样规模的分布式系统能得到正确的升级和数据迁移，同时能持续对外提供服务。这个问题的最有趣的一点是，大规模的分布式环境的升级过程往往很长，甚至升级过程需要持续数月。而在这个过程中，有些机器节点已经升级到新的版本，有些却仍处于旧版中。所以，本文要解决这个问题，最为关键的一点在于能够设计一种方案，使得分布式系统能在新旧版本共存的混合模式下继续对外提供服务。本文设计并且实现了一种自动化的升级和数据迁移的方案，解决了如上所述的问题。并且，本文还做了以下工作：1、分析了分布式系统升级问题的难点。2、设计了较为通用的分布式系统升级的总体设计方案。3、设计并且实现了在单个数据中心的分布式环境里的升级方案。4、设计并且实现了在跨数据中心的分布式环境里的升级方案。5、设计并且实现了较为全面和完整的解决方案。6、提出了创新点，诸如回调函数式的数据迁移等方案。本文所做的工作虽然有限定在一定的技术背景的分布式环境中进行，但因为目前各种主流的分布式系统均有很多的共性和相似点，故依旧可以作为较为通用的解决方案。本文也由衷地希望能给之后涉及到该问题的研究者和工程师一定的参考和启发。﹀
外文摘要：	︿ With the rapid development of social networks, the mobile Internet, electronic commerce and related technologies, Internet users contribute more and more content. To handle this content, every Internet company runs robust distributed systems for storing data, computing on it and extracting value from it. Moreover, many banks, government bodies and major enterprises have strong demand for large-scale data storage systems, which is why cloud computing is so popular at present and is expected to remain so. However, such large-scale distributed systems are very difficult to develop and build, and they need to be upgraded periodically to fix bugs, add features and improve performance. While a system is being upgraded, it must continue to provide service to users, so the basic problem of this paper is how to design a feasible approach for upgrading the software and migrating the data of long-lived, highly available distributed systems. The most interesting aspect of this problem is that it is not possible to upgrade all the nodes in a system at once, since some nodes may be unavailable and halting the system for an upgrade is unacceptable. The risk is that changes between two product versions may make the new version incompatible with the old one, which would interrupt service. Instead, upgrades must happen gradually, and there may be long periods when different nodes run different software versions and need to communicate using incompatible protocols. We present a methodology and infrastructure that make it possible to upgrade distributed systems automatically while limiting service disruption. It is a layer-based upgrade framework in which each service instance can be updated in place in the same slot and retains the same persisted state. The upgrade takes advantage of the lifecycle load balancer, ownership and relocation to proceed gracefully, so that service is not disrupted. In addition, this paper makes the following contributions: 1. It analyses the challenges of upgrading distributed systems. 2. It designs a fairly general overall architecture for upgrading distributed systems. 3. It designs and implements upgrade and migration within a single data center. 4. It designs and implements upgrade and migration across multiple data centers. 5. It designs and implements a complete solution to the problem. 6. It proposes new ideas such as callback-based data migration. Although this work was carried out in a distributed environment with a specific technical background, mainstream distributed systems share many design and architectural features, so we believe that other engineers who face this problem can draw inspiration from this paper, which is our sincere hope. ﹀
分类号：	TP311.5
论文总页数：	67
参考文献总数：	0
馆藏号：	017/M2015(543)
公开日期：	2015-05-28

基于复述及语义分析的智能问答系统.黎槟华

题名：	基于复述及语义分析的智能问答系统
姓名：	黎槟华
学号：	1201210640
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-28
外文题名：	Paraphrase and Semantic based Question Answering System
关键词：	问答系统；知识库；复述；问句生成
论文摘要：	︿随着计算机和互联网的不断发展，网络信息资源呈爆炸式飞速增长，互联网用户如何在海量的信息中准确、快速的获取个人所需要的特定信息成了难题。搜索引擎技术虽然在一定程度上提高了人们获取信息的效率，但仍然有一些明显的缺陷：1）传统的搜索引擎基于关键字匹配算法，返回的是可能存在目标信息的网页地址，使得用户仍需要在这些返回的网页中浏览、筛选以获取特定信息；2）受到日常生活中长期使用自然语言的影响，人们习惯在搜索中使用问句，然而传统搜索引擎对问句缺少语义上的理解，致使很多语义不相关的网页也被返回给用户。于是问答系统应运而生，用户可以使用自然语言向问答系统提问，系统首先对用户的搜索意图进行理解，然后搜索资源、集成信息，最后生成答案，直接返回用户所需信息。相比浅层的、基于关键字的搜索引擎技术，深层的、基于语义的问答系统能更准确的捕捉用户的信息诉求，也进一步提高了用户获取信息的效率。在众多类型的问答系统中，基于知识库的问答系统脱颖而出，受到越来越多的研究关注。知识库的问答系统主要依赖于大型的结构化的知识库，如Freebase，若把知识库比喻成一个巨大的、联通的图，节点就是图中的实体，如人物、事件等；边是实体间的关系。有了结构化知识库，问答系统的任务是对用户基于自然语言的查询问句进行理解，然后将该查询中的实体映射到知识库的实体节点，问句所问的关于该实体的“方面”映射到边，接着将这条边所通向的另一个实体作为结果返回，最后再合成自然语言将答案输出。本文在前人的问答系统基础上，提出若干改进方案。本文主要工作包括：1）在重现前人问答系统的基础上，分别在实体识别、实体链接、问句生成等模块中，提出了自己的改进方案。2）提出了基于谓词过滤、知识库类型系统过滤等模块，提升了系统性能；通过这些改进，有效的将系统F值从39%提高到42%。3）本文总结了多个问答系统的构建、性能调试等经验，为问答系统提出了新的评价指标QMRR，改进后的系统在该指标下取得了48%的得分。﹀
外文摘要：	︿ With the continuous development of computers and the Internet, online information resources are growing explosively. How Internet users can obtain the specific information they need quickly and accurately from this vast amount of information has become a problem. Although search engine technology has improved the efficiency of information acquisition to some extent, it still has some obvious defects: 1) traditional search engines are based on keyword matching and return the addresses of web pages that may contain the target information, so users still need to browse and filter these pages to find it; 2) because people use natural language in daily life, they are accustomed to phrasing queries as questions, but traditional search engines lack semantic understanding of questions, so many semantically irrelevant pages are also returned. Question answering (QA) systems emerged in response: users pose questions in natural language, and the system first understands the user's search intention, then searches resources and integrates information, and finally generates an answer that directly returns what the user needs. Compared with shallow, keyword-based search engine technology, a QA system based on deeper semantic understanding captures the user's needs more accurately and further improves the efficiency with which users obtain information. This paper proposes several improvements on the basis of previous question answering systems. The main work of this paper includes: 1) on the basis of reproducing a previous question answering system, improving its entity recognition, entity linking, and question generation modules; 2) building a predicate filter module and a knowledge base type filter module, which improve system performance; through these improvements, the F value increases from 39% to 42%; 3) drawing on experience in building and tuning multiple question answering systems, proposing a new evaluation index for question answering systems, QMRR, under which the improved system achieves a score of 48%. ﹀
分类号：	TP391
论文总页数：	54
参考文献总数：	35
馆藏号：	017/M2015(566)
公开日期：	2015-05-28

智能电子词典系统的研究和实现.厉海洋

题名：	智能电子词典系统的研究和实现
姓名：	厉海洋
学号：	1201210684
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-28
关键词：	电子词典；词义消歧；用户建模；个性化推荐
论文摘要：	︿各种形式的电子词典系统在很大程度上取代了传统的纸质词典。但是现有的电子词典系统在形式上和内容上与传统纸质词典相比并无太大进步，往往只是纸质词典的电子版而已。这些电子词典没有充分发挥现有软件系统和智能技术的作用，在使用上给用户带来了不必要的外在认知负荷。为了解决现有电子词典存在的问题，本文提出了构建一个智能电子词典系统的设想，并从三个方面对电子词典的智能化进行了研究和探索。首先，为了解决电子词典系统内容不足、内容不新的问题，本文对如何通过Web来自动构建大规模的语言资源数据库进行了研究和探索，并探讨如何在智能电子词典系统中对这些自动收集的语言资源加以有效利用。其次，本文研究了词义消歧在智能电子词典中的应用。主要是利用这些算法对单词的词义进行智能识别，找到最符合当前上下文语境的词义，然后根据识别的结果对词典义项进行排序，从而减轻用户的认知负荷，提高用户的学习效率。实验表明，本文提出的词义消歧系统其性能可以与Senseval评测得分最高的系统相比较，能够在用户查询词典的过程中发挥有效作用，提高词典系统的用户体验。最后，本文对用户兴趣建模在电子词典系统中的应用进行了探索，提出了Semantic-LDA模型，并利用该模型计算用户兴趣和例句的主题分布，研究了如何利用GBDT对各种特征进行融合，然后在此基础上进行排序学习，从而完成对用户的例句推荐。实验证明，GBDT方法有效地融合了特征，在NDCG和MAP指标方面都强于单一特征的方法。﹀
分类号：	H087/TP391
论文总页数：	52
参考文献总数：	0
馆藏号：	017/M2015(754)
公开日期：	2015-05-28

基于协同训练的专利文本分类.位明旭

题名：	基于协同训练的专利文本分类
姓名：	位明旭
学号：	1201210856
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-28
关键词：	文本分类；协同训练；专利文献
论文摘要：	︿专利文献对于产业主体具有重要的商业和战略价值，而专利的分类是组织和利用专利文献的基础性工作。已经有很多研究将机器学习方法应用于专利文本分类并做了针对性的改进，而在自定义类目的专利文本分类实际任务中，很难获得大规模的标注专利文本。本文重点关注用小规模训练集进行专利文本分类的问题，在现有专利文本自动分类研究的基础上，更进一步结合专利文本结构特征鲜明、著录项目信息完备的特点，提出了一种针对专利文献改进的协同训练文本分类策略。与普通方法和参照方法相比，该方法在给定极少量训练标注专利文本情况下，能够获得更准确、稳定的分类表现。针对专利文献中摘要、权利要求书、说明书三部分结构鲜明、内容充分、差异明显的特点，本文首次将协同训练方法引入专利文本分类的任务中，并对协同训练的方法进行改进，提出了一种基于三视图的改进协同训练方法。此外，本文在结合专利文本特征上做了进一步尝试，从各项著录项目信息的意义上详细研究其对分类的作用，然后通过实验进行验证，比较各种部分特征组合在分类任务中的效果。根据得到的结论选择性地利用著录项目信息，利用协同训练过程得到的较为准确的自标注实例，以解决小规模训练集下利用著录项目信息的困难。本文还基于对引用关系的探究，提出了一种利用引用专利内容的分类方法。实验结果显示，改进的协同训练方法能够在互相推荐自标注实例的过程中减少标注错误，最终的融合模型在分类效果和稳定性上都明显优于自训练方法、半指导期望最大化方法以及参照方法。在初始分类器效果不好的情况下，改进的协同训练方法能够更好地保证自标注的准确率，这为在极少量训练样本的情况下利用著录项目信息提供了可能。实验结果显示，选择性地利用著录项目信息，以及使用本文对引用专利内容的利用方法，能够在协同训练的基础上进一步提升分类效果。在本文的实验设置下，最终融合上述方法的分类策略较好地解决了在给定极少量训练标注专利文本情况下的专利文本分类问题。﹀
分类号：	TP391.7
论文总页数：	61
参考文献总数：	0
馆藏号：	017/M2015(328)
公开日期：	2015-05-28

结合思维导图的MOOC学习路径设计与应用研究——以“计算机辅助翻译原理与实践”课程为例.何美伊

题名：	结合思维导图的MOOC学习路径设计与应用研究——以“计算机辅助翻译原理与实践”课程为例
姓名：	何美伊
学号：	1201210595
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2015-05-28
关键词：	MOOC；信息加工；学习路径；思维导图
论文摘要：	︿在科技面前，普通大众既是科技的发明创造者，也是科技成果的享有者，更是科技的忠实追随者。受益于科技变革的助力，教育技术从传统的手工技术为主开始，逐步经历视听媒体技术阶段，再到如今的信息化教育技术。十七世纪捷克教育学家夸美纽斯提倡的直观教学，是媒体教育技术的先声。如今网络媒体悄然引领新一轮教育技术的浪潮，MOOC大规模在线开放课程的兴起，正逐渐影响全球亿万学习者的学习模式，甚至向传统线下教育模式以及传统在线视频教育发出了挑战。学习者们在接收全世界优质公开课程资源的同时，对如何提高在MOOC平台上的学习效果提出了更高要求。SPOC作为“后MOOC时代”的新型教学模式应运而生，为提高校内教学质量的探索打开新思路。本文以北京大学俞敬松老师的SPOC课程《计算机辅助翻译原理与实践》为案例研究对象，针对翻译技术教学在SPOC课程实践中面临的难点，即“知识定制化”与“信息碎片化”对学习效果提出的挑战，笔者试图从学习路径角度探索提高教学质量的有效方法。在“教学并重”、因材施教的教学理念指引下，本文研究集中于SPOC模式下如何帮助学生解决“学什么”和“怎么学”的问题：面向不同学业背景的学生如何进行课程内容裁剪；针对在线课程自主自由等特点如何指导学生设计有效学习过程。本文借鉴加涅的信息加工学习理论并参考思维导图作为教学工具在教学实践中的运用研究，在俞敬松老师的指导下，笔者作为SPOC研究助教，展开为时一个学期的教学研究，旨在从翻译技术课程的学习路径设计角度讨论思维导图在教育技术中的实际效用。通过文献研究、案例分析、调查问卷以及访谈等方法展开本文研究，笔者提出“基于思维导图的学习路径”设计方案并在教学中通过实践应用总结经验，主要成就点包括设计可视化的裁剪图示满足课程个性化定制；增加思维导图时间定位索引设计帮助提高知识回顾效率；利用可编辑的思维导图模板引导、规范学生从预习到复习反馈的学习过程。本文是对SPOC学习路径设计的一次创新，为探索新型SPOC教学模式下学习路径设计提供借鉴。﹀
分类号：	H087/TP391
论文总页数：	84
参考文献总数：	0
馆藏号：	017/M2015(347)
公开日期：	2015-05-28

2015-05-19

回译在英文技术写作教学中的应用研究.徐彬彬

题名：	回译在英文技术写作教学中的应用研究
姓名：	徐彬彬
学号：	1401210791
专业：	计算机技术
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	李博婷
导师1单位：	软件与微电子学院
导师2姓名：	张宏岩
导师2单位：	软件与微电子学院
论文答辩日期：	2015-05-19
关键词：	回译；技术写作；技术写作教学
论文摘要：	︿随着经济全球化的快速发展，很多跨国企业开始在中国设立技术写作部门，中国技术写作行业也开始发展起来。然而国内技术写作发展起步较晚，技术写作人才短缺，难以满足当前的市场需求，国内技术写作教育任重而道远。然而学术界对开展技术写作教学的研究多停留在理论研究层面，对教学方法的实证研究寥寥可数。而当前国内技术写作教学现状存在种种问题，学生技术写作实践机会少，学生虽然在课堂内学习了技术写作的相关知识，却未能熟练掌握技术写作技巧。基于当前国内技术写作行业和技术写作教学现状，笔者提出将回译练习应用到英文技术写作教学中，以此帮助学生学习和掌握英文技术写作技巧，培养学生技术写作能力。回译（back translation）是指将已经翻译的内容再翻译成源语言文字的过程。回译常被用作一种翻译策略，或被用来评估译文质量。回译可用作教学辅助工具，并通过实证研究证明回译练习可以帮助学生提高汉英差异意识，提高学生翻译能力。因此，本研究将回译应用到英文技术写作教学中，在前人研究的基础上，结合企业内部人员的技术写作经验，有针对性地进行教学设计，确定适合回译教学的教学内容，开展了英文技术写作回译教学实验。实验表明，回译练习应用到英文技术写作教学具有可行性和有效性，学生也非常愿意采用回译练习方法进行技术写作实践练习。本研究认为回译练习有助于学生熟练掌握任务型主题文档的构成元素和逻辑结构；有助于培养学生进行产品分析的意识和习惯；有助于学生熟练掌握技术写作风格。其中，回译教学在帮助学生塑造技术写作简洁性方面效果比较显著。本文通过问卷调查还发现，同学们也对回译练习的教学效果持肯定态度。因此，将回译应用到英文技术写作教学中有助于提高学生的英文技术写作能力，回译有助于提高技术写作人员在初级阶段的写作能力。﹀
分类号：	H087/TP391
论文总页数：	83
参考文献总数：	48
馆藏号：	017/M2015(1049)
公开日期：	2015-05-19

2014-12-10

寻找理论到实践的切入点——谈翻译标准对翻译实践的指导意义.陈巧云

题名：	寻找理论到实践的切入点——谈翻译标准对翻译实践的指导意义
姓名：	陈巧云
学号：	1001220609
论文语种：	chi
专业：	翻译硕士
公开时间：	公开
培养层次：	硕士
学位：	法学硕士
培养单位：	北京大学
院系：	外国语学院
导师1姓名：	何卫
导师1单位：	法学院
论文答辩日期：	2014-12-10
外文题名：	The Einstein Syndrome: Bright Children Who Talk Late
关键词：	翻译标准；信达雅
外文关键词：	Einstein syndrome; clever; talk late; early intervention
论文摘要：	︿本翻译项目源文本取自《爱因斯坦综合症——聪明的娃娃晚说话》一书，该书由美国著名经济学家、斯坦福大学教授托马斯·索维尔（Thomas Sowell，1930年6月30日—）所著。索维尔本人就是书名中这样一个孩子的父亲，他通过对自己研究小组里的孩子进行观察研究，结合卡马拉特教授的研究数据，总结出晚说话的聪明儿童在性格行为特征上的一些共性，并结合自己的亲身经历和体会，写成此书。该书于2001年发表，旨在让更多这类孩子的家长了解情况，消除他们的困惑和顾虑，对家长如何对待孩子的这种情况给予一些合理的建议；提醒评估机构和干预机构所做一切应以对孩子负责为出发点，并对教育机构该如何教育这类孩子提出一些建设性的意见。全书共七章，另加前言和后记，涉及内容包括：家庭基因对孩子的影响、儿时曾有过言语滞后问题的成年人的个案实例、说话晚的孩子的行为特征、探索出现言语滞后问题的原因、对该类孩子的测试与评估、对言语滞后的早期干预、以及对可能会出现的不确定因素的处理等几个方面。﹀
外文摘要：	︿ This translation project is based on The Einstein Syndrome: Bright Children Who Talk Late, written by Thomas Sowell, a noted American economist who is now a senior researcher and resident scholar at Stanford University. ﹀
分类号：	H059
论文总页数：	194
参考文献总数：	0
馆藏号：	039/M2014(146)
公开日期：	2014-12-10

2014-11-28

英文科普著作中的隐性连贯及其汉译.周梦洁

题名：	英文科普著作中的隐性连贯及其汉译
姓名：	周梦洁
学号：	1001211049
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工学硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-11-28
外文题名：	Handling Implicit Coherence of Popular Science Translation
关键词：	科普著作；隐性连贯；翻译
外文关键词：	Popular science works; Implicit Coherence; Translation
论文摘要：	︿随着科学技术的发展以及科技产品的普及，科普著作逐渐成为一种常见文本，并且日益影响着人们的生活。科普著作的翻译不仅要体现出科普文本的准确性与严谨性，还要体现一定的文学性与通俗性。以往的科普翻译研究虽然也包括了许多对语篇连贯问题的研究，但是关于隐性连贯问题研究甚少。隐性连贯是相对显性连贯而言的语言现象。某些语篇虽缺乏明显的语法意义上的衔接纽带，比如逻辑连接词或词汇衔接，但是结合文化背景和其他语篇成分，可以得到其中的深层语义关系。这种没有明显“话语标记”的语篇连贯被称为“隐性连贯”。如何处理科普著作中的隐性连贯对于科普翻译研究以及相关策略的提出具有重大意义。在语篇连贯理论的指导下，本文以胡壮麟和张德禄对隐性连贯的相关研究为理论基础，总结了科普著作中的三种隐性连贯，分别是情景省略、背景省略及逻辑暗示。更进一步的，本文在自建语料库和参考语料库的基础上，运用AntConc 3.2.4语料库工具对科普文本中的隐性连贯进行定量及定性分析，既是对前人研究进行了较好的补充，也对自己的翻译工作具有明显的指导性意义。自建语料库使用的是笔者的翻译项目《技术交流》（Technical Communication）第5至8章的原文语料，参考语料库为美国国家语料库。分别对应三种隐性连贯模式，本文通过技术交流语域中的参与对象关键词来分析科普文本中的情景省略；然后通过写作对象关键词来分析科普文本中的背景省略；最后通过例证类连接词来分析科普文本中的逻辑连接。在定量及定性研究的基础上，本文针对这些隐性连贯模式及其语言特点，从情景语境、背景语境及逻辑连接的角度出发，提出相应的翻译策略，如重建情景语境、补充背景信息以及调整语序，以使目标文本达到深层次的语义连贯。事实证明，良好的语义连贯，对于提高科普翻译的用户接受度具有重要的意义。﹀
外文摘要：	︿ With the development of science and technology and the popularization of technological products, works of popular science are becoming a common type of text and increasingly influence people's lives. Translations of popular science works have to be not only accurate and rigorous, but also literary and easy to understand. While previous studies of popular science translation include much research on text coherence, few address implicit coherence. Implicit coherence is a linguistic phenomenon defined in contrast to explicit coherence. Although some texts lack overt cohesive ties in the grammatical sense, such as logical conjunctions or lexical cohesion, the deep semantic relations can still be revealed and understood with the help of cultural background and other textual elements. Textual coherence without obvious "discourse markers" is referred to as implicit coherence. How to handle implicit coherence in popular science works is of great significance for research on popular science translation and for proposing relevant strategies. Guided by the theory of text coherence, and based on the relevant research of Hu Zhuanglin and Zhang Delu, this paper summarizes three types of implicit coherence: situational ellipsis, background ellipsis, and logical implication. Furthermore, on the basis of a self-built corpus and a reference corpus, this paper conducts quantitative and qualitative analyses of implicit coherence with the corpus tool AntConc 3.2.4, which both supplements previous research and offers clear guidance for the author's own translation work. The self-built corpus consists of the source text of Chapters 5 to 8 of the author's translation project, Technical Communication, and the reference corpus is the American National Corpus (ANC). Corresponding to the three modes of implicit coherence, this paper analyzes situational ellipsis in popular science texts through keywords for participants in the technical communication register, then analyzes background ellipsis through keywords for the objects of writing, and finally analyzes logical connection through exemplifying conjunctions. Based on the quantitative and qualitative research, and focusing on these implicit coherence modes and their linguistic features, the paper proposes corresponding translation strategies from the perspectives of situational context, background context, and logical connection, such as reconstructing the situational context, supplementing background information, and adjusting word order, so that the target text achieves deep semantic coherence. Good semantic coherence proves to be significant in improving users' acceptance of popular science translations. ﹀
分类号：	H087/TP391
论文总页数：	223
参考文献总数：	29
馆藏号：	017/M2014(971)
公开日期：	2014-11-28

“self-”复合词的翻译研究.刘飞

题名：	“self-”复合词的翻译研究
姓名：	刘飞
学号：	1101210795
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工学硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2014-11-28
外文题名：	A Translation Study on “Self-” Compounds
关键词：	“self-”复合词；自我意识；读者需求；翻译策略
外文关键词：	“Self-” compound; Self; Readers’ requirements; Translation strategies
论文摘要：	︿复合词是英语词汇的重要组成部分。本文研究的是其中一个特例，由“self－”做前缀构成的复合词。“self－”复合词的针对性探讨不多，大体上归并为复合词的一个例子而已，没有对其进行深入研究。在翻译领域，“self”作为一个非常有特点的构词成分，对于“self-”复合词整体意义有较大的影响。“self”意味着“自我”，“自我”是哲学上的一个重要概念。中西方对“自我”的认知有很大的不同，这种不同对译者在翻译“self-”复合词这一类词时选用的翻译策略有所影响。本文以研究“self-”复合词的翻译策略为核心。首先从语言学、思想意识和翻译三个角度对“self-”复合词的现有研究进行了梳理总结和讨论，指出了当前学者们对“self-”复合词翻译策略研究的不足之处。为了让自己的研究有更广泛的视角，本文没有局限于自己的翻译作品进行讨论，而是引入了多个版本的经典译本，从英译汉和汉译英双方向进行思考。这一对比研究是从语言视角、思想意识和读者需求三个方面思考了不同时期、不同思想背景的译者在翻译“self-”复合词时所采用的策略，以及对自己的借鉴意义。本文最后结合《伍尔夫传》一书的翻译实践，提出了笔者对于“self－”复合词翻译的具体策略，并使用实例对策略进行论证和阐述，证明了策略的有效性，期望能对更广泛的英语复合词的翻译工作有所借鉴。﹀
外文摘要：	︿ Compound word is a very important part of English vocabulary. This paper is studying one of the particular cases, namely the compound words consisting of prefix “self”. However, the study focusing on “self-” compounds is not so much. It has generally been incorporated into the example of compounds without intensive study. In the field of translation, “self”, as a special formation part, has a greater influence on “self-” compounds on the whole meaning. “Self” means “ego”, which is an important concept in philosophy. However, there are big differences upon the cognition of its philosophical significance in Sino-Western culture, which result in the influence on selecting the translation strategies upon “self” translation for translators. This paper focuses on studying the translation strategies of “self-” compounds. Firstly, from the perspective of linguistics, ideology and translation, this paper summaries and discusses the current study on “self-” compounds, and points out the shortcomings of current study of translation strategies upon these compounds. In order to make the research with broader perspective, this paper is not limited to the discussions of the author’s own translation works but introducing multiple versions of classic ones, considering both from English to Chinese and Chinese to English. This comparative study is considering the strategies adopted by translators under different times and ideological backgrounds in translating “self-” compounds from the perspectives of language, ideology and readers’ requirements, as well as its implications. Finally, based on the translation practices of the book--Virginia Woolf, the author puts forwards the specific translation strategies of “self-” compounds and demonstrates and elaborates them with living examples as well. Therefore, it proves the effectiveness of the strategies, in hoping that it can be a reference to translation works with more extensive English compounds. ﹀
分类号：	H087/TP391
论文总页数：	160
参考文献总数：	29
馆藏号：	017/M2014(916)
公开日期：	2014-11-28

2014-05-31

利用语料库和网络资源解决英译汉的“难译词”问题研究——以马汉著作的汉译为例.徐征

题名：	利用语料库和网络资源解决英译汉的“难译词”问题研究——以马汉著作的汉译为例
姓名：	徐征
学号：	1101210571
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-31
关键词：	语料库；网络资源；英译汉；难译词；马汉
外文关键词：	Corpus; Internet; English-Chinese Translation; Difficult-to-translate Words; Alfred Thayer Mahan
论文摘要：	︿笔者在翻译马汉的《海权与1812年战争的关系》过程中，有时遇到一些难以翻译的词，在查询词典中的义项后，仍然难以将其转换为目的语中的对应词，本文将此类词归类为“难译词”。词典的局限性是造成英译汉“难译词”问题的主要原因，主要表现在专业词汇的释义不够准确，有时甚至存在严重错误；忽略词汇的语义韵；两个词的释义区别不大，难以区分细微差异；新词没有被收录到词典中；词组收录不全，等等。因此，查词典并不能解决所有的疑难问题。以往的研究多从语义的角度单独分析近义词辨析、一词多义现象、语义韵等相关问题，很少将这些问题综合起来研究。语料库语言学研究很少从翻译的角度给出相应的对策；在基于语料库的翻译研究中，实证性研究，特别是英译汉的研究不足；对互联网作为翻译辅助工具的研究还不够深入。本文认为，解决英译汉“难译词”问题的途径有两个，即语料库和网络资源。本文以马汉《海权对1660-1783年历史的影响》和《海权与1812年战争的关系》中的“难译词”作为研究对象，并将《海权对1660-1783年历史的影响》的原著和三个译本建设成英汉一对三平行语料库，以方便检索三个译本中的译误，从而说明这些“难译词”给译者造成的困扰。在语料库研究方面，利用Sketch Engine工具自建专用语料库，通过语境共现猜测词义。在网络资源研究方面，利用Google高级搜索功能判断词义；通过Google搜索引擎和其他网络资源辨析易混淆词的词义；以Google的网页搜索和地图搜索功能相结合确定词义。本文通过语料库和网络资源的方法，纠正了词典中的一些错误释义，补充了词典中缺失的释义，指出了我国外交部网站中一部分词的不正确译法，从而有效解决了马汉作品中的“难译词”问题。﹀
外文摘要：	︿ In translating Sea Power in Its Relations to the War of 1812 by Alfred Thayer Mahan, the writer of this paper experienced difficulty in finding translation equivalents to some source language words and expressions from subentries listed in dictionaries. Such kind of words and expressions is termed as "difficult-to-translate words". This problem is caused by the inherent limitations of dictionaries. With respect to professional terms, for example, the subentries are sometimes inaccurate; the semantic prosody of a word is always ignored; there is sometimes little difference between the subentries of synonyms; some phrases and new words are not listed in a dictionary. Therefore, not all problems in translation can be solved by using dictionaries. Previous studies have tended to concentrate almost exclusively on synonym discrimination, polysemy or semantic prosody from the perspective of semantics. Little research has been done on these issues as a whole. In addition, corpus linguistics studies have given few translation approaches; there are few empirical studies conducted in corpus translation studies; further studies on the Internet as a translation tool remain to be carried out. In this paper, corpus and internet resources are identified as solutions to the problem of "difficult-to-translate words". In order to show the confusion caused by "difficult-to-translate words", some words in The Influence of Sea Power Upon History: 1660-1783 and Sea Power in Its Relations to the War of 1812 are presented as examples. To easily query translation errors, a bilingual parallel corpus is built, which contains one English source language and three Chinese translations of The Influence of Sea Power Upon History: 1660-1783. With respect to corpus studies, a special purposes corpus is built with Sketch Engine to enable KWIC display, which helps to guess the meaning of words. 
In internet resources studies, Google advanced search, Wikipedia, Google Maps and some other internet resources are combined to determine the meaning of some words. To sum up, some errors in dictionaries are identified and corrected; some subentries that are not listed in dictionaries are identified; some incorrect translations on the website of the Ministry of Foreign Affairs are found. As a conclusion, the problem of "difficult-to-translate words" can be solved with the help of corpus and internet resources. ﹀
分类号：	H059
论文总页数：	174
参考文献总数：	0
馆藏号：	017/M2014(116)
公开日期：	2014-05-31

英汉IT科普翻译之词汇层面翻译研究——以《傻瓜丛书：无线家庭联网》汉译为例.陈甜甜

题名：	英汉IT科普翻译之词汇层面翻译研究——以《傻瓜丛书：无线家庭联网》汉译为例
姓名：	陈甜甜
学号：	1001210547
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工学硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-31
外文题名：	English-Chinese Translation Studies on Lexical Level of IT Popular Science Text-- A Case Study of Translation Practice of “Wireless Home Networking for Dummies”
关键词：	科普翻译；技术词汇；人称代词；翻译策略
外文关键词：	popular science translation; technical words; personal pronouns; translation strategies
论文摘要：	︿本文的研究对象是英汉IT科普翻译词汇层面的翻译问题；基于《无线家庭联网》一书的汉译实践，本文着重讨论了技术词汇的翻译问题和非技术词汇中人称代词的处理策略。由于原文文本在技术词汇和人称代词应用方面具有典型特征，因此针对此类翻译问题的讨论对相关翻译实践具有一定的启示作用。本文基于相关功能翻译理论分析了科普翻译的标准；结合原文文本，将科普文本的词汇划分为技术词汇和非技术词汇两类；概述了技术词汇的翻译问题及其一般处理策略，阐述了非技术词汇中有关人称代词的研究内容。经过上述研究分析，笔者获得了诸多有关科普文本翻译的启示，从而便于理论联系实际，更好地完成此次翻译项目的实际翻译过程。在技术词汇的翻译研究中，笔者将科普文本中的技术词汇细分为纯技术词汇和次技术词汇，基于词汇的来源分析探讨了科普翻译中的技术词汇的翻译问题，并结合翻译实例阐述了不同情形问题的处理策略。在人称代词的分析讨论中，笔者基于科普原文的人称代词使用特点，从翻译单位、英汉句法结构差异、科普文体风格等方面分析了科普文本中的人称代词翻译原则，并结合实例提出了相应情形下的处理策略。通过本文的讨论，希望可以为科普文章中相关翻译工作提供一定的借鉴。﹀
外文摘要：	︿ This paper studies the problems of word translation in English-Chinese translation of popular science books. Based on translation of Wireless Home Networking for Dummies, the paper focuses on the discussion of translation strategies of technical words and personal pronouns. The original text has typical wording characteristics in using technical words and personal pronouns, so the study of such issues is absolutely necessary and significant. Based on the relevant theories of functional translation studies, the paper analyzes translation criteria of popular science texts. Combining with the original text, the words of popular science texts are divided into two categories: technical and non-technical words. The paper provides an overview of technical words and their general translation strategies as well as the research related to personal pronouns. With the above analysis, the author tries to complete the translation project through integrating theory with practice. In this paper, the technical words of the text are subdivided into two subcategories: the pure technical terms and sub-technical terms. Based on a brief analysis on the sources of technical words, this paper discusses their translation strategies for different situations. Combined with the wording characteristics of the original text, the paper analyzes the strategies used in the process of translating personal pronouns of the popular science text. With respect to the translation of personal pronouns, the analysis involves translation unit, English and Chinese syntactic structures and the style of the original text. ﹀
分类号：	H087/TP391
论文总页数：	167
参考文献总数：	26
馆藏号：	017/M2014(777)
公开日期：	2014-05-31

省力原则指导下的显化翻译研究——以科研文献的翻译为例.马千里

题名：	省力原则指导下的显化翻译研究——以科研文献的翻译为例
姓名：	马千里
学号：	1001210769
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工学硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-31
关键词：	省力原则；翻译；科研文献
论文摘要：	︿本次翻译实践所选书籍《表情的解析》（What the Face Reveals，2005）由著名心理学家保罗·艾克曼等人主编，是一部关于表情研究的科研文集。该书以大量的统计数据为基础，结合了不同的实验方法和具体的应用成果，对使用面部测量技术可得到的丰富信息进行了深入的探讨。正如该书中所述，耗时繁重的面部测量操作限制了新研究可被完成的速度，科研人员阅读科研文献译文时在语言方面所耗费的大量时间和精力也必然会限制他们的研究速度，而省力原则的核心在于“用最小的代价换取最大的收益”，将省力原则应用到科研文献翻译中，通过生成易于理解的“省力”译文，可为科研人员赢得更多的时间和精力去完成更多的科学研究。另一方面，虽然科研文献的原文通常都遵循了结构清晰、逻辑清楚等要求，但是由于英汉语言之间的表达差异和译者翻译策略的选择不同，译文常常不能再现原文的“省力”阅读效果，或常常存在进一步“省力”的空间，因而我们需要、并可以通过有意识地应用省力原则来改进科研文献译文的阅读效果。以往将省力原则与翻译相结合的研究，大多关注的是文本经济性，而本文针对“省力”阅读效果对于科研文献翻译以及科学研究的特殊意义，将研究主要关注点放在了如何生成具有“省力”阅读效果的译文上。本文从翻译信息论和认知框架理论角度考虑了可能影响“省力”效果的因素，并在此基础上，结合此次翻译实践的具体情况，提出了三条可改进译文“省力”阅读效果的显化翻译策略，即“增加结构提示”、“转化逻辑关系”和“系统权衡术语”。这些策略以“省力”为中心，对省力原则在翻译中的实践应用进行了检验和拓展。﹀
分类号：	H059/H315.9
论文总页数：	189
参考文献总数：	0
馆藏号：	017/M2014(217)
公开日期：	2014-05-31

英语技术文档中动词文体特征的研究.安妮

题名：	英语技术文档中动词文体特征的研究
姓名：	安妮
学号：	1101210572
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-31
外文题名：	A Study of Stylistic Features of Verbs in Technical Documentation
关键词：	英语技术文档；动词；文体特征；语料库；功能文体学
外文关键词：	Technical Writing; Verbs; Stylistic Features; Corpus; Functional Stylistics
论文摘要：	︿技术文档是企业全球化过程的产物，旨在为企业提供内部和外部的技术交流。技术文档写作在国际上发展较早，目前已趋向成熟，而该领域在我国发展得较为缓慢。作为全球通用的商务语言，英语成为了技术文档的主要输出语言。随着企业全球化战略的推广和外包服务的发展，越来越多母语非英语的写作人员加入到英语技术文档写作这一领域。然而，有些写作人员在实际的英语技术文档写作过程中缺少对语篇的宏观把握，没有形成良好的语言运用的思维模式，导致市面上许多英语技术文档的质量良莠不齐，影响文档的实用性，不利于企业的全球化发展。鉴于此，本文从语言研究的角度出发，以技术文档中动词的文体特征为研究对象。本文借鉴功能文体学的研究方法，按照文体分析的程序，首先将分析的出发点建立在技术文档语篇上，然后分别对动词的时态、语气结构、情态和语态四个方面进行定性分析和基于语料库的定量分析。其中，语料参照库为布朗语料库家族中的CLOB和Crown，同时根据Manualslib资源构建了基于20个知名欧美品牌技术文档的技术文档语料库。论证中以动词不同形式作为变量分类，并结合通配符与词性赋码在语料库中检索与统计。针对基数不同的问题，使用差异显著性验证的方法来进行求证。最后在实际语料中针对数理统计结果所呈现的突出特征加以解释，总结出技术文档中动词相关形式的文体特征。本文旨在使母语非英语的写作人员在技术文档中动词形式的使用上有明晰和规范的认识，希望从语言层面上对英语技术文档写作有所启发。﹀
外文摘要：	︿ As a product of enterprise globalization, technical documentation aims to provide internal and external technical communication for enterprises. Technical writing developed early internationally and has become increasingly mature. In China, however, the progress of technical writing is still slow. As the common language of global business, English has become the main output language for technical documentation. With the promotion of globalization strategies and the development of outsourcing services, more and more non-native English writers engage in technical writing in English. However, some of them lack a grasp of the discourse as a whole and have not formed a sound way of thinking about language use in practice. This leads to uneven quality in the English technical documentation on the market, which may adversely affect the usefulness of the documentation and the globalization of enterprises. Given this, from the perspective of linguistic study, this paper investigates the stylistic features of verbs in technical documentation. Following the analysis procedure of functional stylistics, this paper first grounds its analysis in technical documentation discourse. It then conducts qualitative analysis and corpus-based quantitative analysis of four aspects of verbs: tense, mood structure, modality, and voice. CLOB and Crown from the Brown family of corpora are used as reference corpora, and a technical documentation corpus was built from Manualslib resources, covering the technical documentation of 20 well-known European and American brands. Verb forms are classified as variables, and a combination of wildcard characters and part-of-speech tags is used for retrieval and statistics in the corpora. Since the corpora differ in size, a test of significance of difference (the chi-square test) is applied for verification. 
With the texts from the corpus, this paper then explains the prominent features shown in the mathematical statistic to summarize the stylistic features of verbs in technical documentation. This paper aims to provide non-native English writers with a more explicit and normative understanding of handling verb forms in technical writing and give enlightenment on the writing at the language level. ﹀
分类号：	H059/H052
论文总页数：	152
参考文献总数：	43
馆藏号：	017/M2014(521)
公开日期：	2014-05-31

文献角度下看汉学的写作和翻译——以The Last Empress: The She-Dragon of China译本为例.李卓勋

题名：	文献角度下看汉学的写作和翻译——以The Last Empress: The She-Dragon of China译本为例
姓名：	李卓勋
学号：	1101210775
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	北京大学
导师2姓名：	李博婷
导师2单位：	北京大学
论文答辩日期：	2014-05-31
关键词：	汉学；文献分析；翻译和写作
论文摘要：	︿自20世纪90年代以来，回译汉学著作、引进海外汉学研究，已经发展成为一种风潮，并被国内学者称赞为“文化的馈赠”。笔者翻译实践的过程就是回译由Keith Laidler所著的汉学作品—《最后的女皇：中国的女龙》（The Last Empress: The She-Dragon of China）一书。从他者视角重新认识中国形象可以激发思考，拓宽国内研究视野，丰富研究思路和方法。但也有学者指出，要警惕海外汉学的话语霸权和颠覆性。可见，对待海外汉学光有热情是不够的，还应保持头脑清醒，客观地分析对待。本文研究的课题是从翻译作品的参考文献入手，考查此部汉学作品的写作和翻译情况。值得注意的是，这里的翻译具体是指原书作者在引用中国材料时，从中文到英文的翻译。研究报告的基本思路是，首先搜集分析全书文献引用的基本情况，然后重点研究其中十本主题文献，论证该部汉学作品的学术性、时代性和独创性。翻译部分则是先回译英译的中文材料，而后反观作者的英译效果。本次翻译报告共包括六个章节。第一章，概述了此次翻译报告的研究背景、及本课题拟解决的关键问题，并探讨了研究思路和意义；第二章，文献综述，重点论述了文献分析对研究汉学作品的重要性；第三章，介绍了翻译作品的作者和书评；第四章，以文献分析数据为基础，考查翻译作品的写作情况；第五章，考察作者对中文材料的英译情况；第六章，总结全文观点，提出辩证看待海外汉学作品。﹀
分类号：	H087/TP391
论文总页数：	41
参考文献总数：	25
馆藏号：	017/M2014(697)
公开日期：	2014-05-31

论主位推进与语篇的衔接和连贯及翻译策略——以《职业健康科学：压力、精神生物学与工作新天地》为例.仲婕

题名：	论主位推进与语篇的衔接和连贯及翻译策略——以《职业健康科学：压力、精神生物学与工作新天地》为例
姓名：	仲婕
学号：	1101211126
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-31
关键词：	主述位理论；主位推进；衔接和连贯；翻译
论文摘要：	︿英语语篇注重形合法，主述位理论是英语语篇衔接和连贯的一个重要方法。该理论最早是由布拉格学派的创始人马泰休斯提出来，经过几十年的发展，成为系统功能语言学的一个重要理论支柱，是语篇分析的一个常用手段。本文在综合分析国内外的主述位理论、主位推进模式研究现状的基础上，汲取其结构划分的优点，克服前人理论存在的模式界限不够清晰、整个段落结构划分不直观的共性问题，为了有效地进行篇章结构的划分，更加直观有效地表达各种推进类型，发展总结出四种主位推进模式：竖线模式、斜线模式、相交直线模式和组合模式，并得出直观有效的主位推进模式结构图的构图方法，并应用主位推进模式进行语句和语篇的结构切分、主位提取和结构图绘制，并结合翻译实践进行翻译策略的研究。在上述理论研究的基础上，本文从《职业健康科学：压力、精神生物学与工作新天地》翻译的实践出发，综合应用四种主位推进模式研究其对语篇衔接与连贯的意义；充分结合该理论，论述在何种情况下保留语篇的原主位结构的翻译策略，及在无法完整保留原语篇或语句的主位结构的情况下的翻译策略，以达到实现译文语篇的衔接与连贯的目的。﹀
分类号：	H087/TP391
论文总页数：	179
参考文献总数：	0
馆藏号：	017/M2014(725)
公开日期：	2014-05-31

电影研究中术语的翻译策略—以《银幕上的中国：电影与民族》为例.赵雪艳

题名：	电影研究中术语的翻译策略—以《银幕上的中国：电影与民族》为例
姓名：	赵雪艳
学号：	1101211112
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-31
外文题名：	Translation Strategies for Cinematic Terms-A Case Study of China On Screen: Nation and Cinema
关键词：	电影研究；术语；术语翻译；翻译策略
外文关键词：	Cinema Studies; Term; Term Translation; Translation Strategies
论文摘要：	︿本翻译研究译本《银幕上的中国：电影与民族》是由电影研究学者克里斯·贝利和玛丽·法夸克合著完成，并于2006年出版问世。著者通过对已有理论成果的批判性借鉴和典型影片的深入分析，提倡以全新的民族电影研究范式重新审视中国电影和民族之间的关系。书中出现了很多西方电影研究术语，然而由于中西语言、文化以及电影现象之间的差异，很多术语并不能直接“嫁接”于中国电影研究中，从而构成了翻译过程中的一大难点。本文基于China on Screen: Cinema and Nation 的翻译体验，针对电影研究中术语的处理提出两大翻译原则：尊重著者的学术立场和置于语境之中。在上述原则的指导下，笔者将术语分为规范术语、争议术语和新兴术语三类，同时考虑到电影研究的跨学科特性，将学科间借用术语作为单独分类进行讨论。针对上述四种类型提出相应的翻译策略，即参照标准、比较与选择、合理创造和灵活处理。通过使用翻译实践的实例对策略进行阐述，说明策略的有效性。最后，本文对书中术语进行详细地梳理，汇总出四个分类的术语表，以期为同类书籍的翻译提供借鉴。鉴于翻译书籍的学术性质，本文在对翻译实例的分析过程中，尤其注意对该领域内专业知识的详细说明，旨在增强译者的专业背景知识，为加强电影研究领域的中西学术交流做出贡献。﹀
外文摘要：	︿ This translation project is based on China on Screen: Cinema and Nation, co-authored by Chris Berry and Mary Farquhar and published in 2006. The book argues for taking a new approach of national cinema to reexamine the important relationship between Chinese cinema and the nation. As an academic study, this book abounds in Western cinematic terms in its exploration of cinematic works. However, owing to the differences between Chinese and Western languages, culture and cinema, many terms cannot be directly “applied” to Chinese cinema and therefore pose a great challenge to the translation of this book into Chinese. This paper first proposes two principles of translating these terms with an emphasis on the author’s academic position and the context. Next, the thesis divides cinematic terms into four categories: standard terms, controversial terms, new terms and interdisciplinary terms given the interdisciplinary nature of cinematic studies. Then it goes on to put forward by examples specific translation strategies, such as conforming to the national standards, comparing and choosing, making reasonable inventions and flexible adaptations. The thesis also summarizes a list of terms of each category with detailed annotations. Given the academic nature of the translated text, the present thesis pays particular attention to supplying professional information about cinema in its analysis of terms for the enhancement of the translators’ expertise. The original contribution of this paper lies in that it offers strategies for the translation of interdisciplinary terms as well as supplying elaborate term lists for both readers and translators of cinematic texts. ﹀
分类号：	H059/H315.9
论文总页数：	193
参考文献总数：	40
馆藏号：	017/M2014(780)
公开日期：	2014-05-31

幽默语气的翻译策略——以《匆匆》为例.刘琼

题名：	幽默语气的翻译策略——以《匆匆》为例
姓名：	刘琼
学号：	1101210806
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-31
外文题名：	Translation Strategies of Humorous Tone-Take Rush as an Example
关键词：	幽默；翻译策略；《匆匆》
外文关键词：	humor; translation strategies; Rush
论文摘要：	︿翻译项目基于《匆匆》一书，作者托德·布什霍兹，该书出版于2012年，阐释了现代社会如何通过竞争获得幸福。《匆匆》一书反映了美式幽默的语言和文化特色。如何将幽默的微妙寓意及诙谐效果再现到译文中，正是本文要探讨的内容。幽默具有文化特殊性，中美幽默存在巨大差异。本文基于幽默研究和幽默翻译研究的成果以及《匆匆》一书的幽默翻译实践，确定了本书的两大幽默翻译原则，包括确定幽默目标、避免幽默禁忌。根据《匆匆》一书的言语幽默翻译实例，本文归纳了书中具有代表性的三类幽默目标，包括调侃政客、调侃宗教、调侃读者，分析了幽默目标选择背后的政治、文化因素，在此基础上进一步探讨了美式幽默的借鉴性以及对中式幽默的反思，具体体现在幽默性格、幽默目标、幽默内容和幽默场合四个方面。立足于中美幽默的异同，本文归纳总结出四种幽默翻译策略，包括直译法、补充翻译法（文内补充翻译和加注翻译）、方言翻译法（地域方言翻译法和网络流行语翻译法）、成语翻译法（成语叠加押韵法和成语改编法）。在进行幽默翻译时，译文的内容与形式既要尽可能贴近原文，又要灵活取舍原文的幽默元素，尽量让译文读者能够理解字里行间的幽默意蕴，体会到幽默的美学魅力。﹀
外文摘要：	︿ This translation project is based on Rush written by Todd G. Buchholz and published in 2012. This book proposes that happiness is obtained in modern society through competition. As Rush well reflects the linguistic and cultural features of American humor, this thesis aims to discuss ways of conveying the subtle implications and humorous effects to the target text. Humor is culture specific. Chinese humor and American humor have substantial differences. Based on the research of humor and humor translation as well as the translation practice of Rush, this thesis identifies two principles of humor translation, namely determining targets of humor and avoiding taboos of humor. On the basis of verbal humor translation in Rush, this thesis summarizes three targets of American humor, namely, politicians, religions and readers, due to political and cultural reasons. Then it elaborates on the significance of American humor and the introspection of Chinese humor from the aspects of humorous character, target, content and situation. In view of the similarities and differences between Chinese humor and American humor, this thesis sums up four strategies of humor translation, including literal translation, translation with supplement (translation with in-text supplement and annotation), translating into dialect (translating into territorial dialect and translating into network catchword), translating into idiom (translating into rhymed idiom reduplication and translating into adapted idiom). During the process of translating humor, target text should not only reproduce the closest equivalence of source text both in content and form but should also select original humorous elements flexibly. The translation should aim to allow readers to understand the wit of humor and feel its aesthetic charm. ﹀
分类号：	H059/H315.9
论文总页数：	160
参考文献总数：	40
馆藏号：	017/M2014(63)
公开日期：	2014-05-31

跨文化传播视角下基于读者因素的译文详略处理研究——Chinese Business Etiquette and Culture 翻译报告.李响

题名：	跨文化传播视角下基于读者因素的译文详略处理研究——Chinese Business Etiquette and Culture 翻译报告
姓名：	李响
学号：	1101210762
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-31
外文题名：	Study on Amplification and Omission Translation Strategies based on Reader Factors from the perspective of cross-cultural communication-- Translation Report of Chinese Business Etiquette and Culture
关键词：	读者因素跨文化传播译文详略处理
外文关键词：	Reading Motivation Cross-cultural Communication Translation Strategies of Amplification and Omission
论文摘要：	︿随着经济全球化的不断发展和中国国际地位的不断提高，越来越多中国题材的英文书籍涌现出来，这一本本书就像是一扇扇窗口，透过它们外国人希望更加透彻全面地了解中国。笔者此次翻译的Chinese Business Etiquette and Culture就是这样一本书，原作者为了帮助让西方读者更好地与中国机构以及中国人打交道，向他们详细介绍了中国的商务礼仪及文化，并提出了实用性较强的行为策略。笔者在翻译本书时，认为译文的主要目标群体是一些需要和外国人打交道的中国商务人士，其主要阅读动机并非了解中国商务礼仪和文化，而是希望站在外籍商务人士的角度，了解他们是如何看待中国商务礼仪及文化的，从而更好地与外籍商务人士打交道。本文从这样的传播目的出发，探讨了如何对译文进行详略处理，使之更加符合读者的阅读需求。首先，在翻译工作开始之前通过问卷调查粗略定位读者需求，根据读者需求建立本次翻译工作的标准。然后，在翻译标准的基础上，对需要信息省略处理和补充处理的内容进一步分类，细化翻译策略，指导翻译工作。翻译工作完成后，将翻译标准中读者可以感知的标准作为外部评估标准，并以此设计问卷，通过读者反馈对翻译策略进行评估，最终确定翻译策略的有效性。通过研究，笔者认为本书译文中需要进行信息省略处理的内容分为五类：（1）关于中国语言和特有事物的释义性内容；（2）关于中国历史背景和文化传统的细节描述；（3）与译文传播目的关系不大的带有政治色彩的词汇；（4）针对西方读者的基础技能指导；（5）常识性的文化误读。译文中需要进行补充的内容分为四个方面：（1）译文读者感到陌生的具体事物和机构组织的介绍；（2）西方文化背景及社交礼仪；（3）原作者意图的表达；（4）针对译文读者的实用性建议。研究表明经过以上几个方面的详略处理，译文更满足读者的阅读需求，更符合传播的目的。﹀
外文摘要：	︿ With the development of economic globalization and the entrenchment of China’s international status, a growing number of English language books about China are published, through which westerners can get a better understanding of China. The purpose of the author of Chinese Business Etiquette and Culture was to introduce Chinese business etiquette and culture and provide some practical behavior strategies to westerners, which helps them better deal with Chinese institutions and people. The target audience of the translation of this book is Chinese business people doing business with westerners. To understand Chinese business etiquette and culture is not their motivation to read this book. Instead, they want to know how westerners understand Chinese business etiquette and culture so that they can better deal with western business people. To this end, this paper explored the translation strategies of amplification and omission according to the reading motivation of Chinese readers. First of all, this paper analyzed the demand of readers through a questionnaire survey before starting the translation, and established translation criteria of this book according to readers’ demand. Secondly, based on the translation criteria, the translation strategies of amplification and omission were defined by contents of the book. After the translation was completed, four standards which can be perceived by readers were used as the evaluation criteria. According to the criteria, another questionnaire was designed to gather the feedback of readers and evaluate the effectiveness of the translation strategies of omission and amplification. 
Omissions in translation are divided into five categories: (1) Paraphrase of Chinese language and things unique in China; (2) Detailed introduction of Chinese history and traditional culture; (3) Political vocabulary irrelevant to the purpose of the translation; (4) Guidance on basic communication skills for Western readers; (5) Cultural misunderstanding of common sense. Amplifications in translation are divided into four categories: (1) Things and organizations unfamiliar to Chinese readers; (2) Western culture and social etiquette; (3) Intention of the original author; (4) Practical advice to Chinese readers. In this paper, it is proved that the translation strategies of amplification and omission discussed above can meet the demand of readers and achieve the purpose of communication. ﹀
分类号：	H087/TP391
论文总页数：	182
参考文献总数：	0
馆藏号：	017/M2014(102)
公开日期：	2014-05-31

翻译中的欧化现象及其可接受度的实证研究——Breaking Free的翻译实践报告.王汉江

题名：	翻译中的欧化现象及其可接受度的实证研究——Breaking Free的翻译实践报告
姓名：	王汉江
学号：	1101210923
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-31
外文题名：	A Case Study on the Acceptance of Europeanization in E-C Translation: A Report on the Translation of Breaking Free
关键词：	欧化翻译语言接触适度欧化接受度
外文关键词：	Europeanization Language Contact Moderate Europeanization Acceptance
论文摘要：	︿近年来，英语的大量使用与研究对现代汉语带来了较为显著的“欧化”现象，而现代语言学者对中文“欧化”现象的态度也是褒贬不一。但站在客观的角度上分析，虽然有些“欧化”现象对汉语的健康和发展产生了一定的不良影响，但也有很多“欧化”现象丰富与完善了汉语体系。而汉语欧化产生的重要原因之一是英汉翻译，所以，作为英语的学习者，大多数中国译员在翻译的过程中难免会受到“欧化”思维定势的影响。因此，正视“欧化”问题不仅对翻译事业意义重大，也对更好地理解现代汉语的变化大有裨益。汉语从文言文到白话文，再到现代汉语的演进过程，欧化现象无疑为汉语提供了无数的素材与改进。消极欧化向积极欧化的改进也对翻译事业的丰富与发展做出了不小的贡献。本研究以笔者所译《脱身》一书为语料，根据笔者对“欧化”现象的认知与研究，探讨并总结了“欧化”翻译现象在这本书中不同风格语篇下的产生与改进，试图管窥消极欧化所带来的弊端。只有正视翻译中欧化的消极因素，才能减少其对汉语的负面影响，才能更好地接受积极欧化带给汉语的营养。基于此，笔者通过对《脱身》一书的翻译实践，在本研究中提出了“适度欧化”这一概念。并以实际翻译为例，分析“适度欧化”在翻译过程中如何实现。最后，笔者制作了问卷，通过多维度考察公众对“消极欧化”和“积极欧化”现象的接受程度。通过对问卷调查的结果运用统计学方法进行研究，来部分地反映出实际现象，从而为理论的提出提供依据和进一步分析的空间。﹀
外文摘要：	︿ In recent years, the widespread use of English and the extensive research on it have led to prominent changes in Modern Chinese, notably Europeanization. Modern linguists hold different opinions on Europeanization. Although some Europeanized usages have had a negative impact on the Chinese language, many others have enriched and improved it. E-C translation is one of the main causes of Europeanization, and as learners of English, most Chinese translators can hardly avoid being influenced by Europeanized habits of thought when they translate. Facing the issue of Europeanization squarely is therefore of great significance, not only for translation but also for a better understanding of the changes in Modern Chinese. As Chinese evolved from classical to vernacular Chinese and eventually to Modern Chinese, Europeanization undoubtedly provided the language with abundant material and improvements. The shift from negative to positive Europeanization has also contributed to the enrichment and development of translation. Using my translation of Breaking Free as the corpus and drawing on my understanding of Europeanization, I discuss and summarize how Europeanized translation arises in passages of different styles in the book and how it can be improved, in an attempt to glimpse the drawbacks of negative Europeanization. Only by facing the negative factors of Europeanization can we reduce its negative impact on the Chinese language and better accept the benefits that positive Europeanization brings. Based on the translation of Breaking Free, I propose the concept of “Moderate Europeanization” and discuss how it can be achieved during the translation process.
Finally, I conduct a questionnaire survey to study the public’s acceptance of “negative Europeanization” and “positive Europeanization” from multi-dimensional perspectives. Statistical analysis of the questionnaire results partly reflects the current state of the public’s acceptance, so as to provide the additional evidence and further analysis of the “Moderate Europeanization” concept. ﹀
分类号：	H315.9
论文总页数：	178
参考文献总数：	35
馆藏号：	017/M2014(118)
公开日期：	2014-05-31

基于语料库的大学生译作的欧化翻译研究.刘倩

题名：	基于语料库的大学生译作的欧化翻译研究
姓名：	刘倩
学号：	1101210805
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-31
外文题名：	A Corpus-based Investigation of College Students’ Europeanized Translation
关键词：	欧化翻译语料库大学生译作
外文关键词：	Europeanization Translation Corpus Translated Text
论文摘要：	︿对于欧化翻译研究，学者王力、王克非等都进行过深入的探讨，但前人对欧化翻译的研究大多为解释性的。近几年来，随着语料库的蓬勃发展，越来越多的研究者开始运用语料库进行翻译研究，使欧化翻译也由最初的解释性步入描写性研究的新时期。本文以大学生译文语料库为主，原创性汉语语料库为辅的形式对大学生译者的欧化翻译进行研究。意在解决三个问题：其一，当前大学生翻译作品中的欧化现象有哪些？发展趋势是怎样的？其二，大学生译作中欧化翻译现象表现形式的具体分类？其三，针对出现的欧化翻译现象，大学生译者在今后的翻译过程中该如何趋利避害？针对上述问题，本文提出了相应的“研究方法”，即通过自建语料库的形式，对大学生欧化翻译现象进行检索分析。对于实际问题的解决则主要从标点、词汇和句子三大层面进行探讨，并发现大学生译作中欧化翻译现象主要表现在逗号过度使用、字母词“零翻译”及长句法结构使用过多三个方面。针对这些现象，笔者进一步提出了大学生译者在翻译过程中应辩证对待欧化翻译，采取恰当的翻译策略。本文的创新点在于使用了语料库与翻译实践相结合的方法，且主要研究对象是大学生译者。此外，在欧化表现类型分类方面，本文首次将标点符号纳入欧化翻译的分类当中，这是之前所有的欧化研究都没有提出过的。﹀
外文摘要：	︿ Many researchers have discussed Europeanization, but most of their studies remain explanatory. With the continuous development of corpus technology, corpus-based translation studies have become much more popular in recent years. More and more researchers now study Europeanized translation on the basis of corpora, which brings fresh air to the field and combines descriptive with explanatory research. This paper studies Europeanization in college students’ translated texts. Besides building a corpus of students’ translated texts, it also draws on several corpora of original Chinese, such as the UCLA Corpus of Written Chinese and the Lancaster Corpus of Mandarin Chinese. The paper aims to answer three questions: first, what are the main Europeanized phenomena in current college students’ translations? Second, how should these phenomena be classified? Third, how should students treat Europeanization in the process of translation? To answer these questions, the author built a corpus of translated texts by Chinese learners of English. The paper then discusses Europeanized phenomena at the levels of punctuation, words and sentences, of which the punctuation level is put forward for the first time. Through corpus analysis, the author finds that Europeanized translation in college students’ translated texts shows up in three aspects: the overuse of commas, the overuse of long syntactic structures, and the “zero translation” of lettered words. In addition, this study focuses only on college students, since they are the most likely to become translators in the near future. ﹀
分类号：	H059/H315.9
论文总页数：	159
参考文献总数：	80
馆藏号：	017/M2014(129)
公开日期：	2014-05-31

中式英语在词项搭配层面的表现探析.张涵

题名：	中式英语在词项搭配层面的表现探析
姓名：	张涵
学号：	1101211054
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	中国人民大学外国语学院
论文答辩日期：	2014-05-31
外文题名：	Preliminary Studies on Chinglish Collocations
关键词：	中式英语词项搭配偏误对比
外文关键词：	Chinglish Collocation Error Contrast
论文摘要：	︿中式英语指带有中文语音、语法、词汇特色的英语，是一种洋泾浜语言，是英语在中国的“本地化”，在语音语调、词汇、语法、语篇层面都有不同的表现。中式英语在以汉语为第一语言、英语为第二语言的学习者的口语和书面语中经常出现，按照结构主义派的见解，这是母语习惯干扰目的语习惯的结果；按照转换生成语言学派的说法，是母语能力 (L1 competence) 和目的语能力 (L2 competence) 彼此冲突。可以认为，两种语言的对比分析 (contrastive analysis) 有助于学习者更好地了解目的语和母语之间的共同点和差异，以规避可能产生的偏误；而偏误分析 (error analysis) 通过中介语和目的语的对比，使用语言学的原则和方法，可系统地识别、描述并系统阐释学习者所犯偏误，比对比分析更具针对性，对于第二语言教学的实用价值更大。鉴于词项搭配对语言习得意义重大，近三十年来，语言学界的诸多学者越来越多地关注词项搭配这种语言现象。有研究表明，人脑对惯用语串 (formulaic sequence) 的处理在速度和精确度上明显高于非惯用语串 (non-formulaic sequence)。对于词项搭配的掌握情况关系到语言学习者语言使用的准确性和流利度，也关系到语言教学的效果。本研究对《中国学生英语笔语语料库》(WECCL 1.0) 中抽取的229条语料里的词项搭配偏误进行分析和归类，借鉴对比分析、错误分析和论元结构等理论，不仅从语言形式上探究中英两种语言的差异对学习者造成的影响，也从不同民族思维方式的差异来探究这些偏误产生的深层原因。本研究具有较高的理论和实践价值。在理论层面，通过大量英语中正确词项搭配和中式英语中相应的有误词项搭配的对比分析，可以看出英汉两种语言词项之间结构和意义关系的差异，研究趋于深化，最终有助于揭示词汇结合在人类认知系统中如何组织等问题。在实践层面，对于中式英语在词项搭配层面的表现的归类分析，有助于帮助英语学习者有意识地规避搭配偏误，对于英语作为第二语言的教学、尤其是词汇教学而言，也是一个有力的参考。﹀
外文摘要：	︿ Chinglish, an interlanguage between Chinese and English, appears in the form of English influenced by Chinese. It is found in practically every aspect of the language system, including pronunciation, intonation, syntax and semantics. According to structural linguists, such interlanguages as Chinglish are the result of interference of the first language in the second. Likewise, some transformational-generative theorists observe that an interlanguage shows the conflict between L1 competence and L2 competence. Given what's discussed above, it is believed that a contrastive analysis helps the learner better understand similarities and differences between his/her mother tongue and target language and thus avoid possible errors in his/her use of target language. Different from contrastive analysis, an error analysis is more specific in identifying and explaining errors on a systematic basis, which may cast light on second language teaching and acquisition. In the last three decades, the term "collocation" has drawn more attention from linguists for its significance in second language acquisition. According to some, the human brain processes formulaic sequences faster than non-formulaic ones. The accuracy and fluency in a learner's use of the target language is largely affected by his/her knowledge of formulaic sequences, namely, collocations. In composing this thesis, I adopt theories about contrastive analysis, error analysis, and argument when exploring the root causes of learners' collocational errors found in the 229 items from essays of Written English Corpus of Chinese Learners. The thesis is of theoretical and practical value. Theoretically, a contrastive analysis of erroneous English collocations in Chinglish and the correct ones in English offers the possibility of discerning the syntactic and semantic differences between the English and Chinese languages. Further research to this end may help explain how collocations form in human cognition. 
From a practical perspective, the analysis and classification of collocational errors in Chinglish may help learners find out the root causes of their errors, which can be reduced and even avoided if remedied timely. Research in this field also helps English teaching and acquisition, especially at the lexical level. ﹀
分类号：	H313
论文总页数：	100
参考文献总数：	46
馆藏号：	017/M2014(710)
公开日期：	2014-05-31

开放课程特点及对应翻译策略研究——以斯坦福大学《iOS应用开发》为例.崔梦婕

题名：	开放课程特点及对应翻译策略研究——以斯坦福大学《iOS应用开发》为例
姓名：	崔梦婕
学号：	1101210608
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	柏晓静
导师1单位：	软件与微电子学院
导师2姓名：	张宏岩
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-31
关键词：	开放课程字幕翻译策略
论文摘要：	︿开放课程是目前全球教育领域的一个趋势，开放课程字幕作为其内容的重要组成部分，在促进全球教育资源的知识获取中起到了重要的作用，具有大量的翻译实践需求。目前，国内外的相关研究主要集中在开放课程的实际应用、影视字幕研究的理论和实践等方面，关于开放课程翻译方面的系统研究较少。国内对于开放课程字幕翻译这一领域的研究方法与理论仍不太明确，缺乏系统的指导理论和规范的翻译模式，值得我们进一步进行研究。本文通过将开放课程与影视剧、教科书类出版物和传统课堂进行对比，与翻译实践相结合，总结出开放课程翻译的特点。开放课程翻译具有翻译对象、时间空间、复杂性以及翻译模式和标准方面的多重限制性，同时，由于其受众和目标的明确性、原文不确定性和信息补充需求，开放课程翻译还具备一定的自由性。在翻译开放课程时，要考虑开放课程翻译的特点。开放课程翻译的关键是理解开放课程的各种限制，利用自由性，消除观看者对于源语言的理解障碍，高效地传递课程内容，帮助观看者更好地理解课程内容。开放课程的翻译实践并不是单纯的一对一翻译过程，而是内容的整体转化，一切翻译实践均以目标受众的正确理解和知识的准确传达为基础，在转换过程中需要遵守这个原则，进行一定程度上的取舍和增补。在此基础上，本文提出了合理增译、省略不译和改写转换等具体的翻译策略，同时以译例加以解释和证明。﹀
分类号：	H315.9
论文总页数：	233
参考文献总数：	38
馆藏号：	017/M2014(542)
公开日期：	2014-05-31

2014-05-30

生态翻译视角下交叉学科科普文本中隐喻的汉译策略研究——以《真正的环境危机》为例.鹿桐欣

题名：	生态翻译视角下交叉学科科普文本中隐喻的汉译策略研究——以《真正的环境危机》为例
姓名：	鹿桐欣
学号：	1101210827
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
外文题名：	On E-C Translation Strategies of Metaphor in Interdisciplinary Popular Science Text from the Eco-translatology Perspective-Take the "Real Environmental Crisis" as an Example
关键词：	交叉学科科普隐喻生态翻译策略
外文关键词：	Interdiscipline Popular science Metaphor Eco-translatology Strategy
论文摘要：	︿交叉学科作为一种新兴的学科群，正越来越受到国内外学者的重视。而这种科学知识的传播与普及也需要科普著作来实现。交叉学科科普文本给读者提供了从不同学科看问题的视角。该类文本同属科普题材，集科学性和文学性于一体。其突出的特点之一便是大量隐喻的使用。这些隐喻的使用复杂多样、灵活多变，在交叉学科的背景下体现了一定的特殊作用，对策略的选择和读者的认知产生了很大影响。隐喻是语言使用中的一种普遍现象，长期以来都被视为一种修辞手段。对隐喻的研究最早要追溯到2000多年前的亚里士多德。随着语言学的不断发展，1980年Lakoff & Johnson在《我们赖以生存的隐喻》一书中首次提出了概念隐喻理论，开辟了认知隐喻研究的崭新历程。胡庚申教授在2004年发表的《翻译适应选择论》一书中提出了以“译者为中心”的翻译理念，指出译者在翻译过程中要对翻译生态环境做出选择性适应与适应性选择，并从语言维、文化维和交际维等多维度进行转换。这一理论对翻译及隐喻翻译研究有很强的指导意义。本文在生态翻译学的翻译适应选择论指导下，以《真正的环境危机》一书中隐喻的翻译为例，重点探讨了交叉学科科普文本中隐喻的翻译策略。首先，文章简要概述了隐喻的概念，介绍了生态翻译学理论及其应用。其次，笔者从译本出发，将交叉学科语境下的隐喻表现类型概括为四类，通过实例进行了详细阐释。在此基础上，本文对不同类型隐喻的作用进行了分析，并总结了隐喻的这些作用对翻译策略选择产生的影响。再次，本文探讨了交叉学科科普文本隐喻的“翻译生态环境”，提出了具体的翻译原则和五种翻译方法：保留原隐喻；保留原隐喻，并加以注释或说明；适当转换隐喻形象；对隐喻进行引申；直接意译。文章最后总结了交叉学科科普文本中影响隐喻翻译策略选择的各种因素，包括文化因素、政治因素、上下文语境等，指出了可能的研究意义和需要改进的地方。﹀
外文摘要：	︿ As an emerging group of subjects, interdisciplinary fields are attracting growing attention at home and abroad, and their knowledge needs to be spread and popularized through popular science writing. Interdisciplinary popular science texts offer readers views on an issue from the perspectives of different disciplines. As a special type of popular science discourse, such texts combine science and literature, and one of their prominent features is the heavy use of metaphor. These metaphors are complex and varied and play special roles in the interdisciplinary context, which affects both the choice of translation approaches and the readers’ cognition. Metaphor is a universal phenomenon in language use and has long been regarded as a figure of speech. The study of metaphor can be traced back to Aristotle more than 2,000 years ago. With the continuous development of linguistics, the Conceptual Metaphor Theory, first proposed in Lakoff and Johnson’s 1980 book Metaphors We Live By, opened up a new course in the study of cognitive metaphor. In 2004, Hu Gengshen put forward the concept of “translator centeredness” in the book An Approach to Translation as Adaptation and Selection. He points out that translators need to make selective adaptations and adaptive selections in the translational eco-environment and to carry out multi-dimensional transformation in at least the linguistic, cultural and communicative dimensions, which offers strong guidance for translation studies and for the study of metaphor translation. Guided by the theory of Translation as Adaptation and Selection in Eco-translatology, this paper discusses approaches to translating metaphor in interdisciplinary popular science discourse, taking the metaphors in The Real Environmental Crisis as examples. Firstly, it gives a brief overview of the concept of metaphor and introduces Eco-translatology and its application.
Secondly, it groups the metaphors found in interdisciplinary popular science discourse into four general kinds, illustrated with detailed examples from The Real Environmental Crisis, analyses the functions of each kind and sums up how these functions affect the choice of translation approaches. Thirdly, it explores the “translation eco-environment” of metaphor in interdisciplinary popular science discourse and puts forward specific principles and five translation methods: keeping the original metaphor; keeping the original metaphor and adding notes or explanations; converting the metaphorical image; extending the metaphor; and free translation. Finally, the paper summarizes the factors that influence the choice of metaphor translation strategies in such texts, including cultural factors, political factors and context, and points out the possible significance of the research and the room for improvement. ﹀
分类号：	H315.9
论文总页数：	204
参考文献总数：	41
馆藏号：	017/M2014(472)
公开日期：	2014-05-30

基于语料库统计的英汉连词省译研究——以《苏联语言政策》为例.李金蔓

题名：	基于语料库统计的英汉连词省译研究——以《苏联语言政策》为例
姓名：	李金蔓
学号：	1101210738
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
关键词：	连词省译英汉翻译语料库欧化
论文摘要：	︿作为语言的有机组成部分，连词在构建篇章方面有着重要用途。英汉两种语言中连词承担着相似的功能，是增强语言逻辑、丰富语言色彩不可或缺的手段。当前很多译者在英汉翻译中对连词的翻译重视不足，依照两种语言中连词的功能相似而僵硬地进行移植，忽略了英汉思维和文化的差异，更忽略了连词深层的作用和意义，造成了逻辑表达的僵硬。自 1976 年韩礼德和哈桑提出衔接理论以来，许多语言学家已从不同方面对衔接进行了研究。近几十年来衔接成分的翻译已逐步渗透到翻译研究领域，极大地推动了翻译理论与实践的发展。本文首先提出了如下研究问题：1）为什么会出现连词的省译；2）连词的省译遵循怎样的规则。经过对英语语料库、英汉翻译语料库和汉语语料库的统计分析，我们发现连词省译现象主要源自英汉两种语言组织特点不同——英语属形合，汉语属意合。在英汉翻译中，为符合汉语语言表达习惯，就需要适当减少连词的使用。笔者从连词的功能入手，分析连词在句子中承担不同的作用及其对翻译的影响。最后，通过定量和定性研究，笔者发现承担建构功能时，可以省略汉语中没有对应形式的从属连词；语义功能下，部分表示感叹和修辞的连词可以省略不译，汉语中以其他成分和语义内容表达出感叹的意味；发挥语用功能时，作为语义预设成分的连词不可省略，作为话语标记语的连词是否可以省译要看上下文语境间的关联是否紧密。但语言的生命力是无穷的，语言时刻都在发展变化，这里提出的策略不过是对既往翻译现象的提炼，不能涵盖所有情况，所以在使用策略时仍需视具体情况而定，切不可一概而论。最后笔者对提出的策略进行了实例验证，以《苏联语言政策》一书中的翻译实例，对比省译与不省译两种情况下译文的优劣。最终证明上述省译策略的合理性和可行性。﹀
分类号：	H087/TP391
论文总页数：	166
参考文献总数：	0
馆藏号：	017/M2014(555)
公开日期：	2014-05-30

科普翻译准确性与可读性的平衡研究——以《宇宙的100个关键发现》为例.郭萃

题名：	科普翻译准确性与可读性的平衡研究——以《宇宙的100个关键发现》为例
姓名：	郭萃
学号：	1101210648
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
外文题名：	The balance research of accuracy and readability of popular science translation
关键词：	科普翻译准确性可读性障碍平衡性翻译策略
外文关键词：	Translation of Popular Science Accuracy Readability barriers Balanced translation strategies
论文摘要：	︿科学传播旨在将科学知识传播并普及给普通大众。我们若想了解国外先进的科学知识，就需要引进大量科普类外文书籍，翻译成本国语言后供大众阅读，科普翻译也就成为连接中外科学传播的桥梁。但是由于语言文化差异等因素，在翻译过程当中产生这些问题：首先，市面上的科普读物质量参差不齐，常有译者错译、漏译以换取语言的通俗易懂，以致科普翻译准确性与可读性严重失衡，科普读物翻译的质量标准有待统一。其次，缺乏对科普翻译受众群体的细分，科普翻译策略实际操作性欠佳。再次，国内对于天文科普翻译可读性的研究不够深入，缺乏有效的翻译策略来解决翻译天文类科普读物时遇到的实际困难。笔者针对以上问题进行了如下研究。首先，笔者对准确性与可读性二者的定义及关系进行了辨析，论述了科普翻译时平衡准确性与可读性的必要性。通过分析可读性的影响因素及评价标准，笔者以《宇宙的100个关键发现》一书翻译实践为基础，建立了天文科普翻译质量标准。其次，笔者针对天文科普类读物出现的各种可读性问题，以传统的内容分类法为基础，按照可读性障碍产生影响的时间，给可读性障碍作出新的分类——即时障碍与延时障碍，以明确不同类别可读性障碍的重要性与优先性，以提高译者翻译效率。最后，笔者本着“以受众为目标，以平衡为导向”的处理原则，根据该科普书籍的内容细分受众群体，针对特定受众群体，找到准确性与可读性平衡的切入点，从而最大化地满足受众需求；在此前提下，依据本文建立的质量标准，综合运用直译、意译、增补、替代等多种翻译方法，按照可读性障碍影响因素发生的先后，按顺序依次分析并排除，最终达到译文准确性与可读性的平衡。笔者认为，采用这种平衡性的翻译策略，可以提高科普翻译的效率和质量，更好地向大众普及科学知识。﹀
外文摘要：	︿ Science communication aims to disseminate scientific knowledge to the general public. To learn about advanced science from abroad, we need to introduce foreign popular science books and translate them into our own language for the public, so popular science translation becomes a bridge for science communication between China and other nations. However, cultural and linguistic differences give rise to several problems in the translation process. First, there is no quality standard for translated popular science books on the market, so many of them pursue simple, engaging language at the cost of mistranslations and omissions, which leaves accuracy and readability out of balance. Second, for lack of in-depth analysis of readers, popular science translation strategies are not specific enough to apply. Third, because readability barriers are classified only by content and given no order of priority, translators cannot really deal with the actual problems in popular science translation. To address these problems, the author carried out the following research. First, by distinguishing and analysing the concepts of accuracy and readability and the relationship between them, the author argues that it is necessary to balance the two when translating popular science. By analysing the factors that influence readability and its evaluation criteria, the author establishes a quality standard for popular astronomy translation, based on the experience of translating “Universe in 100 Key Discoveries”.
Second, to address the various readability problems in popular astronomy texts, the author builds on the traditional content-based classification and sorts readability barriers into immediate and deferred barriers according to when they take effect. This clarifies their importance and priority and improves the translator’s efficiency. Finally, following the principle “Centered in Public, Oriented in Balance”, the author subdivides the readership according to the content of the book and, for a specific group of readers, finds the point at which accuracy and readability balance, so as to meet readers’ needs as fully as possible. On this premise and under the quality standard established in the thesis, the author combines literal translation, free translation, supplementation and substitution to analyse and remove the barriers in the order in which their influencing factors arise, finally balancing the accuracy and readability of the translation. The author believes that this balanced strategy can improve the efficiency and quality of popular science translation and thereby spread scientific knowledge to the public more effectively. ﹀
分类号：	H087/TP391
论文总页数：	174
参考文献总数：	34
馆藏号：	017/M2014(627)
公开日期：	2014-05-30

服务于翻译教学的学习者语料库定量研究.韩林涛

题名：	服务于翻译教学的学习者语料库定量研究
姓名：	韩林涛
学号：	1101210660
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
关键词：	学习者语料库翻译作业翻译教学错误分类量化分析
论文摘要：	︿随着信息技术的不断发展，越来越多的翻译教师开始在翻译教学过程中应用新的技术工具和手段提高翻译教学效率和质量。翻译作业电子化是信息技术应用的直接结果，为翻译作业的量化分析提供了有力的基础。但目前尚没有任何一所学校开发或使用对电子化翻译作业进行量化分析的工具。本文从过往基于语料库的翻译研究成果出发，在现有翻译教学管理平台的基础上，自主设计并实现了基于翻译学习者语料库的翻译作业量化分析程序，并将该程序应用于真实的翻译作业分析。本文首先回顾总结了前人在基于语料库的翻译研究领域的研究成果，阐述了本文的重点、难点和创新点，提出了本文的研究问题。然后通过针对翻译教师的问卷调查和访谈从翻译教师的角度分析了翻译学习者语料库的量化分析需求，同时对现有国内外主流的翻译错误分类体系进行了介绍，并在分析其优劣势的基础上提出了一套新的翻译错误标注体系。然后本文邀请翻译教师使用北京大学开发的翻译作业批注器搭载新的翻译错误标注体系对翻译作业进行批改。批改结束后，本文使用自主设计的学习者语料库量化分析程序从语料库统计层面、成绩统计层面和错误统计层面实现了对翻译作业的定量分析，通过分析结果阐述了该定量分析程序对翻译教学的反馈作用。随后本文通过对一次翻译作业的批改实验结果的相关性分析和显著性分析验证了上述作业量化分析程序的翻译作业批改效果。最后总结了本文的研究成果，指出了本文研究的局限性和不足，从理论和实践两个层面展望了基于翻译学习者语料库应用于翻译教学研究的未来。﹀
分类号：	H087/TP391
论文总页数：	89
参考文献总数：	18
馆藏号：	017/M2014(693)
公开日期：	2014-05-30

《中国服饰变迁》中服饰文化的翻译研究.季梵

题名：	《中国服饰变迁》中服饰文化的翻译研究
姓名：	季梵
学号：	1101210691
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
关键词：	服饰回译时尚翻译
论文摘要：	︿《中国服饰变迁》一书，主要介绍的是明清两代至改革开放初期中国服饰的变化，其中清代及民国时期是主要介绍部分之一。而众所周知，《红楼梦》以贾府为重点，描写了贾、王、史、薛四大家族的兴衰史，反映了封建社会晚期的社会现实，对于清代的服饰描写种类繁多、辞藻华丽。《京华烟云》则全景式再现了民国时期的风云变换，而其生动的服饰描写也令书中人物变得鲜活，是研究民国时期服装风格的极好范本之一。两书一由中译英，一由英译中，而所述时间又与本书相契合，将本书与二者相结合，进行对比研究，则可深入了解服饰词汇的回译技巧，也利于探寻最适合的中国服饰词汇英文翻译。国内对于服装问题的研究，已有许多成就，然而从翻译角度对服饰文化进行分析，少有综合的、宏观的总结性研究。本文将翻译《中国服饰变迁》时遇到的服饰内容与《红楼梦》、《京华烟云》中英版本中的服饰描写进行综合对比研究，将服饰词汇含义不对等的词汇分为含义区间不对称、含义区间偏离、具有隐含意义的三大类，以回译为方法，从分析原因、提出分类入手，在各类别的框架下进行翻译及回译研究，分析了《中国服饰变迁》中作者对中国服饰词汇的英译的优劣之处，评析英译准确度，并以明末至改革开放时期中国服饰词汇为重点，总结出准确的、普遍可行的、易于理解的中国服饰对等词及不对等词的翻译方法，并将词汇表及词义释析汇总附后，以期为服装历史研究者、服装设计者，以及对服装感兴趣的人后续的研究提供便利，让中国研究者们了解到外国学者眼里的中国文化和真实的中国文化的偏差，因此便能有针对性地进行修正，进而让国外学者更好地了解中国、研究中国。﹀
分类号：	H087/H315.9
论文总页数：	173
参考文献总数：	0
馆藏号：	017/M2014(764)
公开日期：	2014-05-30

语境及篇章对经济类文献翻译的指导.关赢

题名：	语境及篇章对经济类文献翻译的指导
姓名：	关赢
学号：	1001210605
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-30
关键词：	语境篇章翻译经济 AntConc
论文摘要：	︿译者翻译的书籍是Capitalism with Chinese Characteristics：Entrepreneurship and the State，本书是中国经济研究专家黄亚生的中国经济评论。虽然文章用平实的语言写就，但要想完全理解原文以便精确翻译，还需要借助语境和篇章。语境与篇章这两个语言要素在笔者翻译过程中起到了很强的指导作用。本文的一个创新点在于使用语料库语言学软件AntConc在翻译之前对文本进行了译前处理，进行了初步的语境发现和主题发现。这样可以在不阅读冗长原文的前提下快速掌握文章的核心语境。文章首先介绍了书籍的内容、背景和意义，然后介绍了作者的学术经历和工作经历。接着笔者提出了要研究的问题，介绍了研究意义与研究方法。笔者进行了文献综述，回顾了语境理论的源起并提出了自己的功能语境分类法。在下一章中笔者按照AntConc软件统计出的高频词频率顺序分别分析了几个最高频词的上下文，从而引导读者在未通读原文的情况下快速掌握宏观语境和作者的核心观点，即民营乡镇企业在80年代和90年代早期在促进中国经济方面扮演了重要的作用，中国经济的成功应归功于此。接下来，笔者详细阐释了怎样用语境和篇章的观点指导翻译，分节依据就是笔者在第二章中提出的语境功能分类法，主要包括语境的语义层面、语境的语用层面、篇章互文性层面、非文本语境层面、读者语境层面。﹀
分类号：	H
论文总页数：	217
参考文献总数：	0
馆藏号：	017/M2014(64)
公开日期：	2014-05-30

MOOC与翻转课堂模式结合的课程设计与应用研究——以翻译技术课程为例.陈泽松

题名：	MOOC与翻转课堂模式结合的课程设计与应用研究——以翻译技术课程为例
姓名：	陈泽松
学号：	1101210601
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
关键词：	MOOC 翻转课堂计算机辅助翻译课程设计课堂活动组织
外文关键词：	MOOC Flipped Classroom Computer-aided Translation Curriculum Design Classroom activities organization
论文摘要：	︿信息技术深刻地改变了人们的生活方式，也再一次重塑了教育。MOOC与翻转课堂（Flipped Classroom）借力于数字化创新，颠覆了传统的教学模式，正在重新构筑课堂内外的学习价值，引起国内外研究者的广泛关注。北京大学于2013年初正式启动MOOC建设项目，翻译技术实践课程以此为契机，申请成为北大首批MOOC课程之一，对信息时代下的翻译技术教学加以审视，希望通过MOOC与翻转课堂结合的教育创新转变传统的翻译技术人才培养模式。笔者作为课程助教之一，在俞敬松老师的指导下参与了该课程MOOC建设和翻转教学的相关研究及实践工作。本文主要探讨了翻译技术实践课程在MOOC与翻转课堂混合模式下的教学设计与应用研究。论文首先介绍了MOOC与翻转课堂视角下的教育变革，提出两种教学理念在课堂内外的结合点及其现实价值。与此同时，本文在PACTE翻译能力模型的基础上，总结了新时代译者的翻译技术能力构成，进一步分析了翻译技术教学在课堂建构、教学策略上的困境，进而探讨翻译技术课程应用MOOC与翻转课堂混合模式的可行性。论文着力于解决翻译技术实践课程在教学效率、MOOC课程设计与翻转课堂活动组织三个方面的关键问题。本文参考了加涅、布鲁姆等人的教学设计理论，结合MOOC教学的相关研究，对翻译技术实践课程的教学目标和课程内容进行了拆解和重组，阐明了课程在MOOC环境中的整体设计方法。在此基础上，本文进一步探讨了MOOC教学与翻转课堂之间的衔接性，提出了“递进翻转教学”模式（Progressive Flip Teaching），通过组织主题式的师生交流活动和项目式的CAT软件竞赛，实现翻译技术实践课程在教学目标和教学策略上的逐层递进。本文是对翻译技术教学的一次创新发展，同时也能为MOOC与翻转课堂混合的教学实践提供有益的借鉴和参考。﹀
外文摘要：	︿ Information technology has profoundly changed people’s lives and has once again reshaped education. MOOCs and the Flipped Classroom, which have attracted wide attention around the world, are overturning the traditional teaching model and rebuilding the value of learning inside and outside the classroom by means of digital innovation. Peking University launched its MOOC program at the beginning of 2013, and the Practice of Computer-aided Translation course took this opportunity to become one of the first 11 PekingX MOOCs. By re-examining how translation technology is taught in the information age, the course tries to change the traditional training of translation technology personnel through a mixed MOOC and Flipped Classroom model. The author, as one of the course’s teaching assistants, took part in the MOOC construction and the flipped-teaching research and practice under the guidance of Prof. Yu Jingsong. This article mainly discusses the instructional design and application of the Computer-aided Translation course under the mixed MOOC and Flipped Classroom model. First, it introduces the educational reform brought about by MOOCs and the Flipped Classroom and identifies where the two teaching ideas meet inside and outside the classroom, together with their practical value. Meanwhile, on the basis of the PACTE translation competence model, the article summarizes the components of translators’ competence in translation technology, analyses the difficulties of translation technology teaching in classroom construction and teaching strategy, and explores the feasibility of applying the mixed MOOC and Flipped Classroom model to the Computer-aided Translation course. The article focuses on three key issues of the course: teaching efficiency, MOOC course design and the organization of flipped-classroom activities.
Drawing on the instructional design theories of Gagné, Bloom and others and on related research into MOOC teaching, this article breaks down and reorganizes the teaching objectives and content of the Computer-aided Translation course and explains the overall design of the course in the MOOC environment. On this basis, it further analyses the link between MOOC teaching and the Flipped Classroom and proposes the “Progressive Flip Teaching” model, in which topic-based teacher-student exchanges and project-based CAT software competitions let the course’s teaching objectives and strategies advance step by step. This article is an innovative development in translation technology teaching and also offers a useful reference for teaching practice that mixes MOOCs with the Flipped Classroom. ﹀
分类号：	TP399
论文总页数：	137
参考文献总数：	57
馆藏号：	017/M2014(69)
公开日期：	2014-05-30

基于CATTP平台的wiki协作式写作和同伴互评研究——以技术文档写作教学为例.吴燕秋

题名：	基于CATTP平台的wiki协作式写作和同伴互评研究——以技术文档写作教学为例
姓名：	吴燕秋
学号：	1101210976
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-30
外文题名：	The Study of Wiki Collaborative Writing and Peer Review on CATTP----A Case Study in Technical Writing Education
关键词：	技术文档写作 Wiki协作 同伴互评
论文摘要：	︿技术文档是全球化产品的重要组成部分。在鼓励团队合作的当下，技术文档写作已不再是以个体知识为基础的写作模式，而是一种集体共同生产知识的写作方式。互联网Web2.0技术的发展对技术文档写作人才提出了新要求，也为我们培养和提高学生技术写作能力创设了优越的数字环境，使学习手段和方式更加丰富多样。如今，越来越多的教师已意识到协作式学习是技术文档写作教学的重点，但是在Web 2.0环境中把Wiki技术与同伴互评应用到技术文档写作教学中的研究几乎还处于空白。本文中，笔者提出在技术文档写作教学中可以采用Wiki技术和同伴互评方法。本文依托于北大语言系的《英语技术文档写作》与《双语编辑与写作》两门课程，在语言系的CATTP教学平台上，以教学实验的形式有逻辑地组织实验步骤、有计划地采集数据，并对CATTP平台数据日志和技术文档文本进行定量分析以及使用问卷调查进行定性分析的方式对实验结果进行分析研究。本文旨在探讨以下问题：能否在技术文档写作教学中采用Wiki技术和同伴互评方法让学生练习技术文档写作？通过本文的实验，学生掌握了哪些技术写作技巧？能否促进学生对技术文档写作的学习与认识？能否提高技术文档写作质量？有哪些优势和缺陷？对技术文档写作教学有哪些启示？本文的主要研究结论包括：把Wiki技术和同伴互评方法应用于技术文档写作教学具有可行性和有效性。学生愿意在技术写作练习中使用Wiki技术进行协作式写作以及进行写后同伴互评。研究认为，在技术写作中采用Wiki技术和同伴互评有以下几点优势：1）有助于锻炼学生的双语写作能力、用户需求分析能力、文档目标和内容分析能力；2）有助于培养学生的独立思考能力和批判性思维能力；3）有助于提高技术文档的写作质量。文档中除了文档大幅度减少单词、语法等表层错误之外，技术文档语言风格也更加显著，文档结构更加直观、清晰、有层次；4）有助于提高学生技术文档的写作水平，学生能够区别技术文档写作与普通写作的不同，掌握技术文档的写作风格和特点，包括文档风格、语篇结构、图表设计和用词、句式、时态、语态等。同时，研究发现，在技术写作教学中使用两种方法都要加强教师的监督和指导，以确保学生进行有效协作及对同伴的文档做出更为准确的评价。﹀
分类号：	H085/H315.9
论文总页数：	84
参考文献总数：	0
馆藏号：	017/M2014(77)
公开日期：	2014-05-30

《倔强的土地》一书中专有名词回译策略研究.于亚楠

题名：	《倔强的土地》一书中专有名词回译策略研究
姓名：	于亚楠
学号：	1101211036
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
外文题名：	A Study of Proper Noun Back-translation Strategies Based on The Stubborn Earth
关键词：	专有名词 回译 翻译策略 中国农学
外文关键词：	Proper nouns Back-translation Translation strategies Chinese agriculture
论文摘要：	︿本文研究内容源自《倔强的土地》一书的翻译实践，该书由斯坦福大学东亚研究中心的美籍研究员邵然蒂所著，于1989年在伯克利出版。该书讲述了美国农业学家直接接触中国农业的一段历史，而翻译此书难度较大的部分就是专有名词中人名、地名、机构名等的回译。专有名词，简称“专名”，在语言中的使用频率较高，是语言库的重要组成部分，其具备的主要功能就是能够表示特定的人物、地点等，以有别于其他同类人物或事物。随着时代的发展，专有名词日益庞大，涉及范围更加广泛，对它的理解和表达准确与否也成为判断回译文本可接受性的标准之一。　　本文比较了国内外对回译的定义，阐述了自身对回译的理解，并对已有的专有名词回译方法进行了梳理总结。通过比较多份参照译本，列举了目前专有名词回译中出现的典型错误示例，阐明了这方面策略研究的必要性和重要性，以引起研究人员及译者的关注。　　本文的重点一是在于根据总结的错误示例类型，从客观和主观两个角度全面阐述了影响专名回译的因素，前人对此鲜有系统性总结；二是结合自身翻译实践及积累的典型示例，深入讨论了专有名词回译的原则、策略，强调了网络搜索资源等现代翻译工具的重要作用，并首次提出了微加注和宏加注的加注法及专有名词回译三步法。最后，提出了分门别类建立专有名词数据库的建议等。但因目前国内外对专有名词回译方法的研究不多，本文的探讨期望对于中国农学历史类专有名词尤其是人名的回译提供一定的指导和参考。﹀
外文摘要：	︿ This translation project is based on The Stubborn Earth: American Agriculturalists on Chinese Soil, 1898-1937, written by Randall E. Stross and published in Berkeley in 1989. It reveals an important period in the history of Chinese agriculture, from 1898 to 1937, when the country received assistance from American agriculturalists. As the book contains a number of Chinese proper nouns, their translation back into Chinese poses considerable difficulty for this project. Whether they are translated accurately becomes an important criterion for judging the acceptability of a back-translation. Therefore, proper nouns become the object of this research project. A proper noun, also called a proper name, denotes, in its primary application, a unique entity, such as a particular person, place, organization, or thing. Since there is so far little research on the back-translation of proper nouns, this paper tries to provide some guidance and reference for the back-translation of proper nouns and hopes to be of use to the historiography of Chinese agriculture. This paper first compares the concepts of back-translation at home and abroad, offers a new explanation of it, and reviews the current methods for the back-translation of proper nouns. Next, by analyzing certain typical wrong translations of proper nouns in previous publications, this research argues for the necessity and importance of back-translation study in order to attract more attention from researchers and translators. This paper goes on to analyze the subjective and objective factors which cause wrong translations. It further discusses the principles and strategies of back-translating proper nouns correctly and efficiently. The original contribution of this paper lies in that it offers two methods of annotation, namely micro-annotation and macro-annotation, and proposes building a database of proper nouns. ﹀
分类号：	H315.9
论文总页数：	184
参考文献总数：	44
馆藏号：	017/M2014(108)
公开日期：	2014-05-30

IT类视频教程的配音和字幕翻译模式的学习效果研究.欧丽

题名：	IT类视频教程的配音和字幕翻译模式的学习效果研究
姓名：	欧丽
学号：	1101210843
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
关键词：	多媒体学习 字幕翻译 配音翻译 学习效果
外文关键词：	Multimedia learning Subtitle translation Voice-Over Learning effect
论文摘要：	︿使用视频教程进行学习已经成为当下流行的学习方式。很多企业也开发自己的视频材料用于内部培训或吸引用户，因此视频教程的本地化也成为很具有潜力的领域。IT 类视频教程通常以软件的功能作为学习对象，使用声音解说配合屏幕操作的方式向学习者展现操作的步骤。目前对于这类视频教程通常采用的翻译方式是字幕或者配音，但是对于以学习知识为目的、在双语学习情境下，不同翻译模式对学习效果影响的现有研究很少。本研究使用来自 Lynda.com 的介绍 Excel 2010 Sparklines 功能的真实学习材料，对字幕和配音两种翻译模式的学习效果进行实验研究。本研究包括两个实验，实验一将真实的 IT 公司的员工作为被试，对该视频的中文配音版、与中文配音稿内容完全相同的长字幕版、在中文配音稿基础上缩减的短字幕版的学习效果进行研究。研究中发现，长字幕版的学习效果要明显好于配音版和短字幕版，而配音版和短字幕版的学习效果差异不明显。长字幕版带来的学习效果上的提升在高经验人群中差异明显，而在低经验人群中差异不明显。实验二将大学计算机和通信相关专业的本科学生作为被试，进一步研究字幕放置位置（居中和底部）和不同字幕翻译方式（长字幕版、短字幕版、断句版）对学习效果的影响。结果表明字幕居中时学习效果要显著优于字幕在底部的方式，并在字幕居中的三个版本中，翻译不同带来的差异并不显著。从为 IT 类视频教程选择翻译模式的角度来说，将配音稿进行翻译并制作成全文字幕是一种经济高效的选择。对于已经有配音翻译的教程，也可以考虑加入文字以提升学习效果。从学习者的角度来说，为了提升学习效果，应有策略地将注意力在屏幕操作和字幕信息之间进行合理的分配，在重点关注屏幕操作的同时，充分利用文字信息对学习的帮助。﹀
外文摘要：	︿ Watching video tutorials is a popular way of learning nowadays. Many companies are developing their own video tutorials for internal training or to attract users, which makes the localization of video tutorials a very promising field. Generally speaking, IT video tutorials are designed for learners to study specific software functions through on-screen demonstration with narrated explanations. Currently the narrated explanations are commonly translated as subtitles or voice-over. However, little research has focused on how different translation modes affect learning in bilingual settings where the goal is to acquire knowledge. In this study, a real course on the Excel 2010 Sparklines feature from Lynda.com is localized into Chinese in several ways. The learning effect of the voice-over and subtitle versions is compared in two experiments. In Experiment I, subjects from IT companies watched voice-over (narrated explanations in Chinese), long-subtitled (the full text equivalent to the narrated explanations in Chinese) and short-subtitled (a reduced version of the full text in Chinese) versions. The study found that the long-subtitled version generated the best learning effect, especially in the experienced group, while the short-subtitled and voice-over versions did not differ significantly in learning effect in any group. In Experiment II, college students majoring in computer science and communications were selected as subjects. This experiment was designed to further investigate the differences in learning effect between subtitle display positions (center and bottom) and translations (long-subtitled, short-subtitled and phrase-break-subtitled versions). The results indicated that the center group outperformed the bottom group, and that within the center group no significant difference in learning effect was observed between translation versions. From the perspective of selecting a translation mode for IT video tutorials, translating the full text as subtitles would be a cost-effective choice. 
For tutorials that already have the narrated explanations localized in voice-over mode, adding proper on-screen text can be a way to enhance learning. From the learner’s point of view, making the best of text content and strategically allocating attention between subtitles and demonstrations can help enhance learning. ﹀
分类号：	H315.9
论文总页数：	81
参考文献总数：	48
馆藏号：	017/M2014(131)
公开日期：	2014-05-30

国内语言服务业翻译技术认证考试的内容设计与实证.王聪

题名：	国内语言服务业翻译技术认证考试的内容设计与实证
姓名：	王聪
学号：	1101210917
专业：	工程硕士
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
外文题名：	Content Design and Empirical Analysis of a Translation Technology Certification Examination for the Language Service Industry in China
关键词：	语言服务业翻译技术认证考试设计
外文关键词：	the language service industry translation technology certification examination design
论文摘要：	︿近年来随着信息技术的快速发展，翻译技术在语言服务行业的应用变得更为广泛。但目前国内语言服务业还没有专门的翻译技术认证考试，语言服务企业在进行人才评估时，多诉诸国内外已有的某些考试和认证项目，但这些考试并不能很好地适用于国内语言服务业。主要表现为三个问题：第一，与国内语言服务企业的职位技术需求不相符合。第二，考试的难易程度划分和梯度设置不合理。第三，工具软件考核的效度存疑。本文以语言服务业的现状和对人才需求为出发点，对已有的认证考试（CLP项目，SDL项目和MOS项目）和国内外开设计算机辅助翻译相关课程的高校在课程体系设置方面的经验进行分析总结，提出了面向国内语言服务业的翻译技术认证考试。本文首先从宏观上提出该考试的分级，分模块，以及理论实践相结合的设计原则，对上述三个问题提出理论解决办法。然后从微观上对翻译技术认证考试进行更为细节的设计，包括翻译技术认证考试的设计流程，考试大纲，双向细目表，题目编制和技术指标讨论等内容。最后以北京大学语言信息工程系近百名学生作为被试，进行了模拟测试和数据分析，以论证设计的合理性和可行性。本文基于国内语言服务业的调研报告，提出的“分级”设计原则，解决了问题一，即“与国内语言服务企业的职位技术需求不相符合”的问题。论文提出的“分模块”设计原则，使得双向细目表中的细目难度量化可控，从而不同模块组合成的完整试卷可以进行合理的难易程度划分和梯度设置，通过数据实测和计算，验证了问题二，即“考试的难易程度划分和梯度设置不合理”的问题得以解决。通过数据实测和计算还验证了“理论实践相结合”的设计原则，使得软件操作的考察具备更高的效度，能够可靠地度量被试实际具备的软件操作能力，从而解决了问题三，即“软件考核的效度存疑”的问题。﹀
外文摘要：	︿ With the rapid development of information technology in recent years, the application of translation technology in the language service industry is becoming more widespread. Currently, however, with no dedicated translation technology certification for the industry in China, companies in the industry have to resort to other examinations and certification programs, which do not fit well. There are three main problems: first, the mismatch between existing examinations and domestic industrial requirements; second, the lack of control over the difficulty level and separating capacity of the examination; and third, the doubtful validity of the tests on software operation. This thesis analyzes the current situation of the language service industry and its demand for talent. Drawing comprehensive lessons from existing certification programs (such as the CLP program, the SDL program and the MOS program), as well as from related courses offered by universities at home and abroad, this thesis proposes a specialized certification test: the translation technology certification test. First, the thesis puts forward the design rules of classification, modularization, and integration of theory and practice as the respective solutions to the three problems above. Then, in the detailed design that follows, the design procedure, the exam outline, the two-way specification table, the sample items, and the discussion of the technical indicators are completed. Lastly, mock tests were administered to students from the department of computational linguistics engineering, Peking University, to demonstrate the rationality and feasibility of the test. Based on the survey report on the language service industry in China, this thesis solves the first problem with the design rule of classification. 
Measured data show that, under the design rule of modularization, the difficulty of an individual item in the two-way specification table is controllable, so the overall difficulty level and separating capacity of the examination can be controlled; thus the second problem is solved. The measured data also verify that the rule of integrating theory and practice is effective when applied to tests on software operation. This design rule ensures high test validity, so the actual software operation ability of the subjects can be reliably measured. ﹀
分类号：	H059/H315.9
论文总页数：	97
参考文献总数：	30
馆藏号：	017/M2014(178)
公开日期：	2014-05-30

《远东的灵魂》一书隐喻翻译策略研究——以认知隐喻为视角.刘兴颖

题名：	《远东的灵魂》一书隐喻翻译策略研究——以认知隐喻为视角
姓名：	刘兴颖
学号：	1001210744
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-30
关键词：	隐喻 翻译策略 认知对等
论文摘要：	︿本论文翻译对象《远东的灵魂》一书写于1888年，作者是美国学者帕西瓦尔·洛厄尔。该书以日本为远东地区的代表，为读者展现了一名西方学者思维中当时的东方社会，并且对比了东西方社会的差异，分析了产生这些差异的深层次原因。作者在论述时使用了较多隐喻来表达自己的观点，因此准确翻译该书的隐喻是重现作者观点的重要环节。关于隐喻的研究，从西方到东方从未停止过。自二十世纪七十年代莱科夫提出认知隐喻理论以来，隐喻的研究获得了巨大的发展。人类对隐喻的认识也从一种修辞概念向人类思维方式转变。认知隐喻相关的理论研究较多，但是用认知隐喻理论来指导隐喻英汉翻译实践的具体研究却并不多。因此本文尝试从理论结合实际的角度展开对隐喻英汉翻译的研究，以解决在翻译隐喻时遇到的问题。本文首先确定了研究对象，界定了本文所研究的隐喻范围，然后设立了以认知对等为翻译目标，对比了不同隐喻翻译方法的差异，概述了认知隐喻理论，在前人研究的基础上改进了认知翻译假设，增加了映射相似的情况，并将认知翻译假设与隐喻翻译基本方法结合起来，在两者之间建立起了以认知为决策基础的逻辑联系并将其应用到《远东的灵魂》一书的隐喻翻译中，将该书的隐喻分为映射条件相等、映射条件相异和映射条件相似三类，对每一类进行了具体划分，并针对性地提出了不同的翻译策略，以期为隐喻的英汉翻译实践提供一些借鉴。﹀
外文摘要：	︿ This translation project is based on The Soul of the Far East, written by the American scholar Percival Lowell and published in 1888. Revealing a Western perspective on Japan, it compares the East and the West and analyzes what the author sees as the roots of their differences. As Lowell uses many metaphors in the book, an accurate translation of these metaphors is crucial for expressing the author’s view. Since Lakoff published his cognitive theory of metaphor in the 1970s, the study of metaphor has shifted from the rhetorical to the cognitive, i.e. modes of thinking are regarded as important to the formulation of metaphors. However, though theories about cognitive metaphor abound, studies of translation practice guided by cognitive metaphor theory are few. This thesis thus tries to explore the English-Chinese translation of metaphor by way of the cognitive theory. This thesis first identifies the types of metaphor being studied, then introduces the cognitive metaphor theory, and establishes the aim of adopting this theory in translation as one of achieving cognitive equivalence between the source and target texts. Then the thesis compares traditional methods of metaphor translation in order to improve the cognitive translation hypothesis. Next the thesis classifies the metaphors in this book into three types, namely, same mapping condition, different mapping condition and similar mapping condition. Matching them with specific translation strategies, the thesis finally applies them to the translation of the metaphors in The Soul of the Far East and proves their feasibility. It is hoped that this thesis can provide some reference for the English-Chinese translation of metaphors. ﹀
分类号：	H315.9
论文总页数：	150
参考文献总数：	28
馆藏号：	017/M2014(232)
公开日期：	2014-05-30

中美企业介绍文本的写作研究.顾学军

题名：	中美企业介绍文本的写作研究
姓名：	顾学军
学号：	1101210645
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
外文题名：	Research on Writing of Corporate Profiles of Chinese and American Companies
关键词：	企业介绍人际关系翻译写作
外文关键词：	Corporate profile interpersonal relationship translation writing
论文摘要：	︿本研究以语体理论和认知社会语言学理论为指导，通过对中美企业介绍文本中人际关系的构建进行分析和对比，揭示了中美企业介绍人际关系构建在心理距离和行为倾向以及读者角色层面上的异同，并提出了中国企业介绍写作时在构建人际关系上需要采用的策略，从而帮助中国企业构建互动性良好而恰当的人际关系。本研究从语体角度入手，通过语体分析研究人际关系建构中的心理距离和行为倾向；从读者角色入手研究人际关系建构中的企业与员工关系建构背后的认知模式，并主要解决以下几个问题：企业介绍中的人际关系包括哪些方面，如何利用词汇语法的聚合描写作者-读者的心理距离和行为倾向，如何根据目标读者角色构建与读者的人际关系，中美企业对于与读者人际关系构建有何不同的认知模式。首先，对人际关系建构进行词汇语法层面的多维度分析，在构建中美企业介绍语料库基础上，得出语体维度的特征因子，并得到与人际关系构建的心理距离和行为倾向相关的两个维度：互动性和权威性。其次，在认知社会语言学理论框架下，对企业介绍文本的目标读者进行分类，并对其中的企业-员工人际关系描写进行了定量和定性相结合的分析。定量分析主要从描写企业-员工人际关系的不同概念隐喻入手，揭示出中美企业对于企业-员工人际关系的不同认知模式；定性分析则从一篇样本出发，对这些隐喻如何通过聚类和链构建描写企业-员工人际关系的语篇进行了分析。本研究的发现包括：在人际关系构建的心理距离和行为倾向层面，可以通过语体分析得出互动性和权威性维度。在互动性上，中国企业介绍文本的互动性要远远小于美国；而在权威性上，双方则比较接近。在企业-员工人际关系描写层面，美国企业文本更侧重于使用生物隐喻而中国文本侧重使用容器隐喻，而不同的隐喻则反映了中美对于企业-员工人际关系上不同的态度和认知模式。﹀
外文摘要：	︿ Based on style theory and cognitive sociolinguistics, this study analyzes the construction of interpersonal relationships in Chinese and US corporate profiles, reveals the similarities and differences in psychological distance, behavioral tendencies and writer-reader relationships, and proposes writing strategies for constructing interpersonal relationships to help Chinese enterprises build interactive and appropriate interpersonal relationships. This study analyzes the psychological distance and behavioral tendencies in interpersonal relationships through style analysis, and analyzes the cognitive models behind these relationships from the perspective of reader roles. The study focuses on the following questions: which aspects interpersonal relationships in corporate profiles include; how to describe the psychological distance and behavioral tendencies between writers and readers through lexicogrammatical aggregation; how to build relationships with readers according to the roles of the target audience; and what cognitive models lie behind the interpersonal relationships constructed in Chinese and US corporate profiles. First, the interpersonal relationships are studied through multi-dimensional analysis at the lexicogrammatical level, and two dimensions related to psychological distance and behavioral tendencies are derived from the factors, namely the interactive dimension and the authoritative dimension. Second, under the theoretical framework of cognitive sociolinguistics, the target audiences are divided into different categories, and the corporate-staff relationship, as one of them, is analyzed through both quantitative and qualitative description. In the quantitative analysis, the different conceptual metaphors in Chinese and US corporate profiles and the cognitive models of interpersonal relationships behind them are analyzed, while in the qualitative analysis, clusters and chains of metaphors are used to study how interpersonal relationships are constructed in discourse. 
The findings of this study include the following. In terms of psychological distance and behavioral tendencies, the interactive and authoritative dimensions can be obtained through style analysis; the Chinese corporate profiles are much lower on the interactive dimension than the US ones, while on the authoritative dimension the two are close. In terms of the enterprise-employee relationship, US companies focus more on biological metaphors while Chinese companies focus on container metaphors, and the different metaphors reflect different attitudes and cognitive models. ﹀
分类号：	TP311.5
论文总页数：	72
参考文献总数：	0
馆藏号：	017/M2014(233)
公开日期：	2014-05-30

可用性视角下技术文档的翻译研究.焦喜音

题名：	可用性视角下技术文档的翻译研究
姓名：	焦喜音
学号：	1101210704
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-30
外文题名：	The Study of Technical Documentation Translation from the Perspective of Usability
关键词：	可用性 技术文档 翻译 效率
外文关键词：	Usability Technical Documentation Translation Efficiency
论文摘要：	︿随着国外的产品大量涌入中国，技术文档作为产品的一部分，翻译需求也随之增加。技术文档翻译质量的好坏会直接影响到用户对产品的了解和使用情况。国内的技术文档规范尚不完善，翻译质量也不尽如人意，用户抱怨产品手册晦涩难懂的情况屡见不鲜。技术文档的目的不仅仅是为用户提供信息，还要指导用户执行操作。这要求技术文档的翻译不能仅仅停留在“读”的层面，还要深入到“用”的层面。基于这种情况，笔者确定了这一研究题目——可用性视角下技术文档的翻译研究。国内很多学者研究技术文档的翻译采用的是目的论以及功能翻译理论等传统译论。近年来，国际上已经有学者开始将可用性与技术文档翻译相结合。Anni Otava将可用性用于翻译，提出了广义上的翻译可用性准则。Jody Byrne提出了技术文档可用性的标准并在单语环境下验证了相似性连接能够提高技术文档翻译的可用性。在本文中，笔者首先通过可用性测试确定技术文档的翻译存在不足；为了进一步挖掘影响译文可用性的翻译问题，笔者分析了9份中英对照的技术文档，按照可用性的层次，即用户能够获取信息、用户获取信息的效率以及用户获取信息的主观满意度三个层次对翻译问题分类讨论。其中，传统的翻译问题能够利用通用的翻译策略得以解决，但部分问题是由于未充分认识到技术文档的特殊性和用户的特殊需求造成的，需要有更有针对性的策略。最后，笔者提出了四个更有针对性的策略：翻译时采用相似性链接、坚持任务原则和最简原则以及拆分译法。这些策略充分考虑到了技术文档的特殊性和用户的特殊需求，能够更有效地提高译文的可用性。关键词：技术文档；翻译；可用性；使用效率﹀
外文摘要：	︿ With huge numbers of products pouring into China, the demand for technical documentation translation is surging. The quality of the translation directly affects users’ understanding and use of the products. A good translation can win users’ satisfaction and increase their loyalty. At present, the translation quality of technical documentation is unsatisfactory, and users complain about it a lot. Technical documentation serves as a bridge between users and products. For users, readability is not enough; they also ask for usability of the translation. Therefore, the author decided to study technical documentation translation from the perspective of usability. Many researchers have studied technical documentation translation under the guidance of Skopos theory and functional theories. In recent years, some foreign scholars have started to bring usability into technical documentation translation. For translation usability, Anni Otava came up with ten general heuristics for translation. Jody Byrne furthered the notion of iconic linkage and proved that it can improve the usability of English technical documentation. In this thesis, the author started with a usability test and found that the usability of the translation leaves room for improvement. To fully identify the usability flaws caused by translation, the author examined nine English documents and their Chinese counterparts and analyzed them according to the three levels of usability: the availability and clarity of information, the efficiency of obtaining information, and users’ satisfaction. Then the author analyzed the reasons. In the final part, the author came up with strategies more specific to technical documentation, which can better meet the demands of its users. ﹀
分类号：	H315.9
论文总页数：	222
参考文献总数：	2
馆藏号：	017/M2014(242)
公开日期：	2014-05-30

汉英翻译中植物词汇的处理策略研究.国梦影

题名：	汉英翻译中植物词汇的处理策略研究
姓名：	国梦影
学号：	1101210656
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
关键词：	植物词汇 隐喻 概念隐喻理论
论文摘要：	︿植物在人类的生活和生产中都扮演了相当重要的角色。一方面，植物的实用价值较高；另一方面，植物春日里摇曳的枝干、夏日里盛放的花朵、秋日里熟透的果实，这些大自然所赋予的性质和姿态都会引发诸多联想，因此人们会借助植物来委婉表达自己的思想，寄托自己独特的感情，抒发自己的情怀，指代各种自然植物的植物词汇除了其原本的植物学词义外，还产生了各种引申意义。文学作品中的植物往往都承载着丰富的文化内涵，比如《诗经》中描写的152种野生植物，其中包括麦、粟、荇菜、蒹葭、卷耳等等，文中多用植物比喻女子的形象或用于指代其美好的品质，此时的植物便被赋予了人格的文化意象。由于文化背景的不对等和目标读者的不同，这类词语往往翻译过后会在阅读美感和词义传达方面有所缺失。当前的植物词汇的研究文献一般可以分成两类，一类以呈现植物文化意义为主，从语言学或植物学的角度入手，探讨植物词汇的文化意义，或者某种植物的文化价值，其中包含和体现了植物的隐喻意义；另一类则是借助概念隐喻理论的研究角度，把植物隐喻作为一个有机整体来进行分析研究。基于以上的工作情况，本文创新性地以中国古典文学中出现频率较高的典型植物意象为例，分析了各种文本和语境下的桃、草、柳三种植物的隐喻意象和投射过程，最后落实到植物词汇的处理策略，以此进一步证明映射过程在翻译中的应用。笔者希望自己的分层分类方式能够为植物词汇的研究提供新的思路和方法，也为类似研究提供策略和方法层面的参考。﹀
分类号：	H315.9
论文总页数：	185
参考文献总数：	0
馆藏号：	017/M2014(334)
公开日期：	2014-05-30

商务函电模糊词分析及汉译策略研究——以The McGraw-Hill Handbook of More Business Letters为例.杨敏

题名：	商务函电模糊词分析及汉译策略研究——以The McGraw-Hill Handbook of More Business Letters为例
姓名：	杨敏
学号：	1101211015
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
关键词：	翻译策略 模糊词 商务函电
论文摘要：	︿当今世界，经济蓬勃发展，通讯事业不断进步，各国之间的联系越来越紧密，国际贸易发展也愈来愈欣欣向荣，商务函电作为国际贸易的桥梁也发挥着日益重要的作用。因此，商务函电手册的翻译也突显出其重要性。商务函电手册翻译的准确性直接影响着国际贸易的顺利进行。笔者将英文麦格劳—希尔商务函电手册翻译为中文手册，并制成双语商务函电电子版手册。这本双语函电手册既能让国人了解整个贸易流程和贸易中所需要的函电，又能让不精通第二语言的读者迅速通过关键词搜索，找到自己需要的对应函电，解决了用户的燃眉之急，并且减少了贸易中因为用词不当而产生的误会，同时，为广大商务人士提供了一本不可多得的商务函电实例学习辅助教材，对于商务函电的写作和功用有更深理解。笔者翻译作品The McGraw-Hill Handbook of More Business Letters是五百强企业麦格劳—希尔集团旗下著名产品。本研究以这本书为依托，加之其他辅助材料，剖析商务函电模糊词的译法。在商务函电的翻译中，笔者发现模糊词在函电中起着举足轻重的作用，如避免尴尬，防止信息外漏，缓解气氛等等。模糊词作为一种灵活用语，自然是译者研究的重要对象，因此如何翻译好商务函电中的模糊词至关重要。然而，模糊词的把握是一个难点，经过笔者调查，国内研究函电模糊词的学者很少，提出的翻译策略也有很大的局限性。所以笔者在前人研究的基础上，通过自身的翻译实践以及对相关领域的学习研究，重新分类商务函电，研究其中的模糊词及其翻译，提出三条适用于商务函电模糊词英译汉的策略：通过商务函电的用途决策；通过商务函电的上下文语境揣摩；多义词多译策略。本文介绍了研究背景，包括学位翻译书目介绍、研究方法和研究重要性；模糊词的分类及文献综述，包括商务函电模糊词研究现状，研究局限性及笔者研究的创新点；分别任意选取四篇积极类、事实类和消极类商务函电，对其中出现的模糊词种类及数量进行分析，并对其翻译过程加以阐述，对不同种类商务函电中不同类型的模糊词实现情况作了详细说明，这让读者对模糊词的使用及翻译有了更深刻的理解；在知己知彼的基础上，得出笔者对商务函电模糊词英译汉的三条翻译策略。﹀
分类号：	H315.9
论文总页数：	167
参考文献总数：	0
馆藏号：	017/M2014(339)
公开日期：	2014-05-30

跨文化语境中旅游文本的直译加注策略分析——以《巴基斯坦和喀喇昆仑公路》为例.陈丹丹

题名：	跨文化语境中旅游文本的直译加注策略分析——以《巴基斯坦和喀喇昆仑公路》为例
姓名：	陈丹丹
学号：	1101210586
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
外文题名：	Analysis of the Strategy of Literal Translation with Notes of the Tourism Text in Cross-Cultural Context
关键词：	跨文化 文化缺省 旅游文本 直译加注
外文关键词：	cross-cultural culture default tourism text literal translation with notes
论文摘要：	︿每种语言都具有地域性、民族性、历史性乃至民俗和文化等背景，因此在语际转换过程中，很难从目标语中找到与源语言等义的词，来表达其在原有语境中的文化含义，使目标受众能感受到同等效果。翻译是一种跨文化活动，文化缺省问题普遍存在，旅游文本的翻译也不例外。关于广义的文化缺省的翻译补偿，前人在理论方面的研究已经相对完善；然而，专门针对某种特殊文本，特别是旅游文本的翻译补偿研究寥寥无几，针对旅游文本的文化缺省特点进行翻译补偿的实践研究空缺急需填补。通过对旅游文本的特点及其文化缺省的分析，本人认为：跨文化语境中，旅游文本的文化缺省有以下特点：首先，旅游文本中大量具有文化内涵的词汇存在文化缺省现象，阻碍其译文读者了解原文文化内涵，融入旅游目的地文化。其次，文化缺省引发读者对于旅游文本译文的三大诉求：其一，译文应充分补偿原文的文化缺省问题；其二，补偿信息便于读者自主选择阅读；其三，保持文本的简洁性和易读性。最后，旅游文本的重要功能是使读者融入到旅游目的地文化中去，因此对于文化缺省的补偿具有很高的要求。在跨文化语境旅游文本的翻译中，如何针对上述特点对文化缺省问题进行补偿，以达到旅游文本的文化融入目的，是一个值得研究的问题。本文针对这一问题，对不同翻译补偿策略进行分析，并且通过实践证明：直译加注策略最适用于补偿旅游文本文化缺省问题，其能满足旅游文本译文读者文化融入的需求，对日后旅游文本的翻译实践具有指导作用。本文以孤独星球系列旅游指南《巴基斯坦和喀喇昆仑公路》的翻译为例，进行跨文化语境中旅游文本的直译加注策略分析，探究适合旅游文本的文化缺省的补偿策略，使得译文既补偿了文化缺省问题，又满足读者诉求，达到文化融入的目的。﹀
外文摘要：	︿ Every language has its own regional, national and historical character, as well as backgrounds such as folk customs and culture. Therefore, in translation it is hard to find a word in the target language that is fully synonymous with the one in the source language, expresses the original cultural meaning, and gives the target audience the same effect. The problem of cultural default exists universally in translation, and the translation of tourism texts is no exception. With regard to translation compensation for cultural default in general, our predecessors have already achieved a great deal on the theoretical side; however, research on translation compensation aimed at a specific type of text, especially tourism texts, can hardly be found, and the gap in practical research on translation compensation based on the characteristics of cultural default in tourism texts needs to be filled. Through an analysis of the characteristics of tourism texts and of cultural default, I consider that in a cross-cultural context the cultural default of tourism texts has the following characteristics. First, tourism texts contain a large number of words that can cause cultural default. Second, the cultural default of tourism texts gives rise to three demands from target-language readers: the translation should compensate for the cultural default; the compensating information should be available for readers to read at their own choice; and the text should remain brief and easy to read. Third, the demand for translation compensation in tourism texts is very high, because the purpose of a tourism text is to engage readers with the local culture. In the translation of cross-cultural tourism texts, how to compensate for cultural default according to the above-mentioned characteristics, so as to achieve the goal of cultural integration, is a question worth detailed study. 
Addressing that question, through an analysis of different translation compensation strategies and through translation practice, this paper comes to a conclusion: the strategy of literal translation with notes is the most suitable for translation compensation in tourism texts; it can fulfill target-language readers’ demand for cultural integration and can provide guidance for future translation practice. In this paper, the book Pakistan & the Karakoram Highway is used as an example to analyze the strategy of literal translation with notes for tourism texts in a cross-cultural context, and thus to explore a cultural compensation strategy that suits tourism texts, one that both solves the problem of cultural default and meets readers’ demands, so as to achieve the purpose of cultural integration. ﹀
分类号：	H315.9
论文总页数：	163
参考文献总数：	20
馆藏号：	017/M2014(340)
公开日期：	2014-05-30

软件本地化中的UI翻译策略.赵颖豪

题名：	软件本地化中的UI翻译策略
姓名：	赵颖豪
学号：	1101211113
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	姚亚芝
导师1单位：	北京交通大学语言与传播学院
论文答辩日期：	2014-05-30
外文题名：	UI Translation Strategies in Software Localization
关键词：	软件本地化 本地化翻译 翻译 翻译策略 国际化 UI
论文摘要：	︿中国网站分析服务市场巨大，国外服务商纷纷涌入中国。在一次国外网站分析企业的软件本地化翻译实践后，笔者发现网站分析行业软件本地化的UI翻译质量不尽如人意。查阅现有相关文献，笔者发现本地化学者的研究多集中于本地化工程技术、本地化项目及质量管理、本地化教学与人才培养等，而翻译学者钟情翻译理论研究。一部分翻译学者涉足本地化研究，但由于缺乏对本地化行业的了解，其研究成果有的过于肤浅，有的过于理想，本地化翻译的研究仍有待加强。中国的网站分析服务市场有其特殊性。第一，国内监管混乱、审查严重。国外服务商也不敢贸然增资。第二，网络环境混乱，监测技术难度大。第三，尚未能实现大数据驱动模式，大数据地位不够。第四，本土和外来竞争对手多，适应力不一。第五，竞争对手价格竞争力强。因为这些原因，网站分析行业软件本地化特别强调成本问题。笔者在网站分析行业软件本地化翻译实践中发现，一味追求翻译质量对于重视投入产出比的商界来说并非上策。采取妥协态度，在成本一定的情况下追求最佳翻译质量才更加符合实际需求。通过充分发挥译者的主观能动性，笔者解决了翻译实践中的现实问题，而不影响软件本地化成本。经过总结与分析，笔者归纳出在成本一定的情况下，软件本地化中翻译软件UI时可以使用的几种策略，分别是UI规范化翻译策略、UI术语库翻译策略、减字策略、停顿切分策略、变换UI对话角色设定的策略以及借鉴受控语言思想的策略，以提高软件UI的可用性。﹀
分类号：	H315.9
论文总页数：	243
参考文献总数：	0
馆藏号：	017/M2014(477)
公开日期：	2014-05-30

语境在传记文本翻译中的重要作用——以《纳尔逊传：霍雷肖·纳尔逊的一生和他的传奇故事》为例.卢凤骄

题名：	语境在传记文本翻译中的重要作用——以《纳尔逊传：霍雷肖·纳尔逊的一生和他的传奇故事》为例
姓名：	卢凤骄
学号：	1101210823
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	姚亚芝
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
关键词：	语境 传记文本翻译 情景语境 文化语境
论文摘要：	︿传记是一种独特的文本类型，它记录传主的一生或者某个阶段的事迹，在文学、历史、教育方面都有很高的价值，有些传记甚至被直接当作史料。关于传记的翻译方法，前人提出了许多理论，而笔者从传记文本的特点出发，提出语境在传记文本的翻译过程中起到了重要作用。由于传记记载的都是前人的事迹，其所处社会、历史环境等都与译者有所不同，所以在理解原文时，译者要将原文放入对应语境中，才能获得完整的原文信息，在表达译文时，要根据译文读者的语境进行文字转换，这样才能确保原文信息不会流失，让译文读者能够跟原文读者有相同感受。本文以《纳尔逊传：霍雷肖·纳尔逊的一生和他的传奇故事》为例，按照翻译过程，分别探讨原文理解阶段和译文表达阶段中，语境所起到的作用。在原文理解阶段，原文语境起主要作用，笔者从原文的上下文语境、情景语境和文化语境三类分别进行深入探讨，得出上下文语境有明确指示语所指、明确逻辑链条、明确词义的作用，情景语境有推断人物情绪以及推断语言含义的作用，文化语境则起到了补充逻辑信息、补充关键词信息以及补充特点信息的作用，而多种语境相结合时又起到了检验理解的正确性和推测未知信息含义的作用。在传记文本译文表达阶段，译者同时受到原文语境与译文读者语境的束缚，双方语境在这一阶段共同作用，限制着文字、句式以及内容增减的选择。﹀
分类号：	H315.9
论文总页数：	181
参考文献总数：	26
馆藏号：	017/M2014(560)
公开日期：	2014-05-30

Never in My Wildest Dreams中隐喻的翻译——基于博弈论视角.朱文佳

题名：	Never in My Wildest Dreams中隐喻的翻译——基于博弈论视角
姓名：	朱文佳
学号：	1101211151
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	姚亚芝
导师1单位：	外国语学院
论文答辩日期：	2014-05-30
外文题名：	Metaphor Translation of Never in My Wildest Dreams---From the Perspective of Game Theory
关键词：	隐喻翻译 博弈论
外文关键词：	metaphor translation game theory
论文摘要：	︿本文重点研究自传作品Never in My Wildest Dreams: A Black Woman’s Life in Journalism中隐喻的翻译。大量隐喻的使用是本书的一大特点，且书中的隐喻发挥着重要作用，包括使语言生动形象、准确传达作者思想和情感、讽刺以及篇章连贯。因此，研究透彻本书中隐喻的翻译就抓住了翻译本书的关键。本文首先从传记文本历史性和文学性的特点以及译本中隐喻发挥的重要作用出发，得出该译本中隐喻的翻译原则，即尽量保留隐喻所包含的文化特征。该原则要求译者兼顾保留隐喻的文化特征和译文可读性，并在两者发生矛盾时做出选择和取舍。基于此，笔者运用博弈论思想，从博弈论的视角看待隐喻的翻译问题，把隐喻的翻译问题转化成译者根据博弈信息在两者之间做出取舍，进而选择相应翻译方法的过程。笔者把Never in My Wildest Dreams中隐喻的翻译看作以译者为中心的两个双人博弈，即译者与原文作者的博弈以及译者与译文读者之间的博弈，其中译者扮演着双重角色，既是原文读者，又是译文作者，是在夹缝中生存的博弈方。博弈的策略是笔者根据前人研究和该译本的翻译实践总结出来的6种隐喻的翻译方法：直接移植喻体，转换成目的语中对应的喻体，扩展喻体、补充喻底，用明喻代替隐喻，保留喻体、增加注释，省略喻体、转换为意义表述。博弈的目的是使收益最大化，具体到该译本的隐喻翻译中，是指在忠实于原文的基础上，为读者提供可读性最好的译文。博弈的信息指译者为达到收益最大化所要考虑的各方面信息，而该译本隐喻翻译过程中需要重点考虑的信息是语言、文化和译本中隐喻的作用。在尽量保留隐喻所包含的文化特征这一翻译原则指导下，博弈的过程就是译者同时与原文作者和译文读者进行博弈，最大限度地运用博弈信息，在保留隐喻所包含的文化特征和译文可读性之间做出取舍，然后据此选择翻译方法的过程。在保留隐喻所包含的文化特征与译文可读性不矛盾的情况下，根据原语和目的语中隐喻的语言和文化是否相通，可以进一步划分为三类：语言相通、文化相通或无文化内涵；语言不相通，文化相通或无文化内涵；文化不相通。笔者在判断语言和文化是否相通的过程中，运用了语料库、Google等检索工具，增加了判断结果的可信度。以上三个分类分别对应的翻译方法是：直接移植喻体；明喻代替隐喻，转换成目的语中对应的喻体，扩展喻体、补充喻底；保留喻体，增加文内注释。这几种翻译方法既能保留隐喻所包含的文化内涵，忠实于原文，同时又不会影响译文的可读性，即能获得收益的最大化，达到博弈的目的。在保留隐喻所包含的文化特征与译文可读性矛盾的情况下，根据译本中隐喻是否具备讽刺或篇章连贯的作用，可以进一步划分为保留隐喻的文化特征和舍弃隐喻的文化特征。这两类分别对应的翻译方法是：保留喻体、增加脚注和省略喻体、转换为意义表述。保留喻体、增加脚注的翻译方法保留了隐喻的文化特征，舍弃了译文的可读性，没有达到收益的最大化，但却是在译者能力范围内、充分考虑原文作者和译文读者的需求下，所能获得的最大收益。而省略喻体、转换为意义表述的翻译方法虽然舍弃了隐喻的文化特征，但由于隐喻在译本中并没有发挥讽刺或篇章连贯的作用，也就是说在该译本中舍弃隐喻的文化特征，用意义表述的方法仍然能传达原文作者的思想情感。因此，这种方法表面上看没有忠实于原文，实则舍弃了形式，传达了情感，既忠实于原文，又保证了译文的可读性，获得了收益的最大化，达到了博弈的目的。﹀
外文摘要：	︿ This thesis focuses on the translation of metaphors in the autobiography Never in My Wildest Dreams: A Black Woman’s Life in Journalism. Metaphors appear in every chapter of the book and play an important role. Thus, the key to translating this book is to analyze the metaphors and to weigh each word in translating them. This thesis first summarizes the translation principle for the metaphors in the autobiography, which is to retain the metaphors’ cultural meaning as far as possible. This principle requires the translator to choose between retaining the metaphors’ cultural meaning and the readability of the translated text (TT) when the two conflict. Based on this choice-making process, the thesis introduces game theory, transforming the metaphor translation process into a choice-making process. In this thesis, metaphor translation in the autobiography is regarded as two two-person games: one between the translator and the source text (ST) author, the other between the translator and the TT reader. In these two games, the translator plays a dual role as both the ST reader and the TT author. The strategies of the game are the 6 translation methods summarized from other scholars’ research and from the translation practice of the autobiography: reproduce the same image in the target language (TL), translate the metaphor by simile, replace the image of the source language (SL) with the corresponding one in the TL, extend the meaning of the image and supplement the sense, retain the image and add notes, and convert metaphor to sense. The aim of the game is to maximize the payoff. In the metaphor translation of the autobiography, this means being faithful to the ST while producing the TT with the best readability for the readers. The information of the game refers to all the factors the translator needs to consider to gain the maximum payoff. 
In the metaphor translation of the autobiography, the translator mainly needs to consider three factors: language, culture and the roles of the metaphors in the autobiography. In these two simultaneous two-person games, the translator chooses between retaining the metaphors’ cultural meaning and the TT’s readability according to the information of the games. If there is no need to choose between retaining the metaphors’ cultural meaning and the TT’s readability, the translator can choose from the following 5 strategies. When the linguistic and cultural factors of the SL and the TL are interlinked or no cultural connotation is contained in the image, the translator can reproduce the same image in the TL. When the linguistic factors of the SL and the TL are not interlinked, while the cultural factors are interlinked or no cultural connotation is contained in the image, the translator has three choices: translate the metaphor by simile, replace the image of the SL with the corresponding one in the TL, or extend the meaning of the image and supplement the sense. When the cultural factors of the SL and the TL are not interlinked, the translator can retain the image and add notes within the text. To verify whether the linguistic and cultural factors of the SL and the TL are interlinked, the thesis uses corpora and Google. These 5 strategies not only retain the metaphors’ cultural meaning but also keep the TT readable, and thus gain the maximum payoff. When the translator has to choose between retaining the metaphors’ cultural meaning and the TT’s readability, the role of the metaphors in the autobiography is the key factor. 
If the metaphors serve the purpose of sarcasm or textual coherence, the translator should retain the metaphors’ cultural meaning by retaining the image and adding footnotes; if not, the translator can abandon the cultural meaning and convert the metaphor to sense. Retaining the image and adding footnotes makes the TT less readable, so it cannot gain the maximum payoff. However, this method is the best one within the translator’s capability, given the needs of the ST author and the TT readers. Converting metaphor to sense abandons the metaphors’ cultural meaning, but it can still convey the ST author’s thoughts and emotions, and can therefore achieve the maximum payoff. ﹀
分类号：	H059/H315.9
论文总页数：	182
参考文献总数：	33
馆藏号：	017/M2014(771)
公开日期：	2014-05-30

摄影术语分析及术语库建设.刘溢杰

链接

题名：	摄影术语分析及术语库建设
姓名：	刘溢杰
学号：	1101210814
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	姚亚芝
导师1单位：	外国语学院
论文答辩日期：	2014-05-30
关键词：	摄影术语翻译术语库
论文摘要：	︿在摄影大众化的今天，人们了解和学习摄影的兴趣日益浓厚，而掌握摄影术语则是摄影入门及进阶的关键。目前，国家名词委及其他相关权威机构尚未对摄影术语进行规范，且当前可利用的摄影英汉双语术语资源极为有限，在这有限的资源中，还存在不少问题。摄影术语翻译的不统一势必会给读者带来一定程度的误导及学习障碍。为解决这一问题，笔者从对摄影术语特点的分析入手，对摄影术语进行翻译以及验证。进而从术语的实际应用出发，从认知的角度考虑术语的可接受性，综合语料验证及特殊需求的考量，尝试建立一个比较规范的英汉摄影术语库。在摄影术语分析过程中，笔者发现摄影术语有其明显的时代特征，旧术语译法不可盲目沿用。摄影学科构成的复杂决定了不同领域间术语迁移的必然，而是否使用原领域的译法还需考量。此外，摄影圈文化也为摄影术语的表达提供了补充，摄影俚语的语用有其现实意义。在术语翻译过程中，笔者发现单词型摄影术语多通过隐喻的方式从一般词汇术语化而来，适用意译；而词组型摄影术语则多由已知摄影术语组合而成，适用直译。笔者还主张在摄影术语翻译中使用零翻译取代音译，因其更能满足效率需求及使用者的心理需求。在术语验证时，笔者认为web的生语料即可满足摄影术语翻译验证的需求，但需对语料范围进行限定，选择能反映摄影术语真实使用情况的语料。经过对摄影术语的分析、翻译及验证，笔者最终所建的英汉摄影术语库共收录英汉术语850个，其中核心术语161个，非核心术语689个。术语库与摄影知识系统相结合，涵盖摄影器材，摄影技术，后期处理以及摄影史论、俚语等方面的内容。这是一个面向实际应用的摄影术语库，具有以下两个创新点：一是额外收录不同于一般术语学意义上的术语，如“摄影俚语”、“摄影品牌”方面的术语，这是现存其他术语资源所没有的。二是与摄影知识系统相结合，体现在术语的标记及翻译上。个别术语在摄影的不同分面或者发展的不同时期其定义有所差异，笔者将相关的汉译一一给出，方便译者根据上下文语境有所针对地选择合适的译法。﹀
外文摘要：	︿ Photography, previously limited to a small group of people, has now become widespread among the general public. With that, there is growing interest in learning photography. A good knowledge of photography terms is fundamental to the learning process. However, English-Chinese glossaries of photography terms are currently limited in number and vary in quality. The National Committee for Terms and other authorities have not yet standardized the translation of photography terms. Within the few term sources that can be found in books or online, problems such as mistranslation are widespread. Inconsistencies in photography term translation between different sources may mislead or confuse readers. To solve this problem, the author first analyses photography terms from different aspects to better understand their general characteristics, which benefits the term translation and translation verification that follow. Taking acceptability and practical use into consideration, and with the aid of a web corpus, the author aims to create an English-Chinese photography termbase that caters to real needs. During term analysis, the author points out that photography terms are strongly bound to their historical background, so old translations of terms may no longer be appropriate today. Meanwhile, spanning art, science and craft, photography has imported many terms from a wide variety of subjects. Whether to use the translations from their originating disciplines needs to be considered. Moreover, photography has its own cultural circle, within which various expressions for terms are growing. Such slang has its practical uses. For the translation part, the author finds that single-word terms mainly derive from ordinary words through metaphor. For these terms, free translation is suitable. For compound terms, literal translation using the translations of their component terms usually works. 
The author also proposes zero translation instead of transliteration, as zero translation is more efficient and better meets users’ psychological needs. In translation verification, the author uses raw corpus data from the web to verify term translations. The raw corpus needs to be filtered, however, so that it reflects the real use of photography terms. After term analysis, term translation and translation verification, the photography termbase built by the author includes 850 bilingual terms (161 core terms and 689 non-core terms). Taking the whole photography knowledge system as reference, the termbase covers terms in photography equipment, photography techniques, post-processing, and photography history, theory and slang. This termbase is user-oriented and has two special characteristics. First, unlike other related sources, it includes items that may not be regarded as terms in the strict sense, such as photography slang and famous photography brands. Second, it is specialized for photography in the way the terms are marked and translated. Some terms have different translations in different facets of photography, so the related translations are given and marked for translators to choose from according to specific contexts. ﹀
分类号：	H31
论文总页数：	195
参考文献总数：	0
馆藏号：	017/M2014(105)
公开日期：	2014-05-30

英语隐性逻辑关系的汉译研究——以《信息未来，谁主沉浮？》为例.张诗玲

链接

题名：	英语隐性逻辑关系的汉译研究——以《信息未来，谁主沉浮？》为例
姓名：	张诗玲
学号：	1101211068
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-30
关键词：	隐性逻辑英语意合语篇连贯《信息未来谁主沉浮？》
论文摘要：	︿由于翻译界长期盛行英文形合、显性的观点，译者在英汉翻译中往往无视英文的意合、隐性特征。而逻辑是最活跃、最重要的非语言信息，且相较于显性逻辑关系，隐性逻辑关系极易被忽略。当前诸多译者在英汉翻译中并不注重英语隐性逻辑，产生的译文颇有佶屈聱牙之嫌，而目前有关英语隐性逻辑的研究也鲜有着墨。笔者的翻译项目《信息未来，谁主沉浮？》为美国计算机科学家杰伦·拉尼尔的预言式著作，该书意义跳跃，逻辑隐晦，为英语隐性逻辑的研究提供了良好的语料素材。鉴于此，本文将基于翻译实践，对英语隐性逻辑关系进行深入剖析，提出并论证有效的翻译策略。本文首先批判性地对比了中英文语境下形合意合的异同，厘清了英语隐性逻辑的概念，同时全面梳理了缺乏关注的汉语显性现象。基于本人翻译项目，本文提出并阐释了英语隐性逻辑翻译的三个影响要素，即文本特点、语言差异及译者因素。结合项目中典型的隐性逻辑关系案例，本文创造性地构建了五大隐性逻辑处理策略：位置判断、语形识别、语内转换、语境分析及语用推理，并通过文本中的句内、句群与篇章隐性逻辑，以及因果、条件、假设、转折、递进等逻辑关系对这五个策略进行详细解析，充分例证了策略的有效性。由此，本文首次对隐性逻辑的概念、分布、翻译影响要素及翻译策略进行了系统的分析与论证，以期为英语隐性逻辑的汉译提供理论研究借鉴和实践参考价值。﹀
分类号：	TP311.5
论文总页数：	169
参考文献总数：	0
馆藏号：	017/M2014(409)
公开日期：	2014-05-30

基于语料库的中美高校简介对比研究.刘家良

链接

题名：	基于语料库的中美高校简介对比研究
姓名：	刘家良
学号：	1101210799
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	崔启亮
导师1单位：	软件与微电子学院
导师2姓名：	高志军
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-30
关键词：	语料库目的论高校简介
论文摘要：	︿互联网已经成为全球最为重要的信息传播和交流媒介，网站是高等教育机构展示形象、吸引合作的重要窗口。高校充分利用本校网站传播自身信息，不仅有助于加强与国外高校、企业等组织的交流合作，还可在招收留学生和促进中外高校交流方面起到积极重要的作用。对于中国高校来说，若要给外国访问者留下较好的第一印象，网站的英文版建设就不可以忽略，其中介绍性质的部分相对而言最为重要。一般而言，英文版本的高校简介多由中文版本翻译或改写而来，二者有着重要关联。但是，中美目标受众在信息获取习惯上存在差异，逐字逐句的翻译不是最佳传播策略。中国高校的英文简介撰写要符合外国目标受众的阅读习惯，帮助他们准确高效地理解相关信息。笔者从中美高校简介的对比出发，希望找出二者间的差异，从而发现我国高校简介的改进方向。本文以目的论为基础进行讨论，以高校简介的文本为研究对象进行对比分析。构建中美高校简介可比语料库，对比研究中国高校简介语料库、美国高校简介语料库。论文从文本基本形态，文本目的即言语行为，以及文化差异等角度进行讨论。使用了多个基本统计量、语言特征和高频词等语料库基本统计和对比方法。真实的数据不但反映了中美高校简介的总体轮廓差异，而且在人称代词、缩略语、情态动词、特殊疑问句、名化等具体的语言特征细节上揭示了对比关系，而由词汇使用差异所蕴含的文化差异性更可能是传播效果达成方面最大的障碍之一。笔者由此认为，撰写英文的中国高校介绍性文字时，在保证语法用词的正确前提下，改进文本的客观性，增强其概括性、呼唤性，调整文化倾向性，是达到最佳信息传播效果所必须注意的几个方面。﹀
分类号：	H087/H319
论文总页数：	80
参考文献总数：	0
馆藏号：	017/M2014(57)
公开日期：	2014-05-30
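
The abstract above mentions comparing two corpora through basic statistics and high-frequency words. A minimal sketch of such statistics follows. It is only an illustration under stated assumptions: the function name `corpus_stats`, the naive regex tokenizer, and the sample sentences are hypothetical and are not taken from the thesis.

```python
import re
from collections import Counter


def corpus_stats(text, top_n=3):
    """Compute a few basic corpus statistics for an English text.

    Tokenization is a naive lowercase regex split (an assumption made for
    this sketch, not the thesis's method).
    """
    # Split on sentence-final punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return {
        "tokens": len(tokens),
        "types": len(counts),
        # Type-token ratio: distinct word forms divided by running words.
        "ttr": len(counts) / len(tokens) if tokens else 0.0,
        "mean_sentence_length": len(tokens) / len(sentences) if sentences else 0.0,
        "top_words": counts.most_common(top_n),
    }


sample = "We welcome you. You will learn. We teach."
stats = corpus_stats(sample, top_n=2)
print(stats)
```

Running the same function over two comparable corpora and placing the results side by side gives the kind of overall profile comparison the abstract describes; feature-level contrasts such as personal pronouns or modal verbs would need additional counting rules.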

基于威尔逊翻译模式的科普英文中带连字符的复合形容词翻译研究——以《科普天文学》为例.赵晓玮

链接

题名：	基于威尔逊翻译模式的科普英文中带连字符的复合形容词翻译研究——以《科普天文学》为例
姓名：	赵晓玮
学号：	1101211111
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	中国人民大学
论文答辩日期：	2014-05-30
关键词：	科普翻译带连字符的复合形容词威尔逊翻译模式翻译策略
论文摘要：	︿英文中带连字符的复合形容词词性搭配丰富，语义结构复杂，且很多都未收录于字典，对于译者来说一直是个难题。对该类词汇系统而正确的认知是译者进行该类词汇翻译的前提和关键。至今很少有学者专门对复合形容词的翻译进行系统的研究，且鲜有学者涉及科普领域。本文从The Complete Idiot’s Guide to Astronomy的翻译实践出发，重点研究科普英文中带连字符的复合形容词翻译问题。本文将以威尔逊翻译模式为基础，探讨该模式指导下科普英文中带连字符的复合形容词翻译方法。威尔逊翻译模式是从认知描述语言学的视角出发，用数学意义上的对等映射关系阐释了翻译是如何进行的以及翻译过程中都有哪些要素发生了映射。威尔逊认为在翻译过程中译者需要将尽可能多的要素纳入该类词汇汉译的映射过程中，包括含义、语言结构、内涵以及语境等。这些要素也正是笔者在翻译本书中带连字符复合形容词的实践中发现需要考量的重要要素。在本文中，首先，笔者对科普英文中带连字符的复合形容词在词汇概念和词序结构两个方面所具有的特点进行了总结。其次，针对以上特点，为了帮助译者对该类词汇形成准确全面的认知，笔者在威尔逊翻译模型的基础上进行了适当改进，以提高科普文章中该类词汇翻译的准确性。具体改进包括：1）优先映射原则，适用于科技术语及惯用表达； 2）对威尔逊模式中的各个要素在该类词汇翻译中的适用性进行了具体分析；3）对科普文章中复合形容词词序结构同汉语词序结构进行了详细对比。最后，笔者对适用于该类词汇的翻译方法进行归纳总结。﹀
外文摘要：	︿ Hyphenated adjective compounds are rich in their combinations and complex in their semantic structures, and many are not yet included in dictionaries. These characteristics have long posed a tough problem for Chinese translators, yet few scholars have studied the translation of such compounds systematically, and fewer still in the English Popular Science (EPS) field. This paper focuses on the translation of hyphenated adjective compounds in EPS text, based on the translation practice of The Complete Idiot’s Guide to Astronomy. Based on Wilson’s translation model, this paper discusses the specific translation process of hyphenated adjective compounds in EPS text. Wilson’s model illustrates how translation occurs and what is mapped during the translation process through a cognitive descriptive approach. It suggests taking as many components as possible into consideration in translation, including linguistic meaning, linguistic structure, connotation, context, etc. All of these components are also necessary elements to consider in translating hyphenated adjective compounds in EPS text. In this paper, firstly, the author summarizes the features of hyphenated adjective compounds in EPS text from the perspectives of linguistic concept and linguistic structure. Secondly, based on these features, the author adapts Wilson’s translation model to the translation of hyphenated adjective compounds in EPS text and makes some improvements to the original model. The improvements mainly include: 1) Priority mapping, which applies to technical terms and conventionalized expressions; 2) A detailed analysis of the mapping of each component during the translation of hyphenated adjective compounds in EPS text; 3) A comparison of word order between English and Chinese. Finally, the author summarizes the translation strategies applicable to hyphenated adjective compounds in EPS text. ﹀
分类号：	H31
论文总页数：	208
参考文献总数：	0
馆藏号：	017/M2014(433)
公开日期：	2014-05-30

基于归化和异化的文化背景翻译策略研究——以The Big Screen: The Story of the Movies and What They Did to Us为例.夏智琳

链接

题名：	基于归化和异化的文化背景翻译策略研究——以The Big Screen: The Story of the Movies and What They Did to Us为例
姓名：	夏智琳
学号：	1101210979
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	人民大学外国语学院
论文答辩日期：	2014-05-30
外文题名：	The Translation Strategies of Cultural Background Depending on the Domestication and Foreignization--A Case Study of The Big Screen: The Story of the Movies and What They Did to Us
关键词：	文化背景；归化异化；翻译障碍；翻译策略
外文关键词：	cultural background; domestication and foreignization; barrier of translation; translation strategies
论文摘要：	︿语言，作为民族文化的重要组成部分，是文化的载体和表现形式之一，同时也是文化继承和传播过程中最重要的手段。不同民族之间虽然语言和文化背景各异，但认知态度和方法却基本类似。因此，若解决文化背景障碍，则人们可以通过翻译实现沟通和交流。故在翻译过程中，文化背景知识的处理具有十分重要的意义。当前很多译者在翻译过程中不重视文化背景的处理，使得读者无法准确理解原著内容，造成文化缺失。西方学者自上世纪中叶起，开始注重文化内涵的翻译策略，奈达、纽马克等皆提出了相应的文化翻译理论。我国文化翻译研究起步较晚，虽然部分学者探索出了适合英译汉的文化翻译理论，但到目前为止还未出现成体系的文化翻译策略。本文基于翻译The Big Screen: The Story of the Movies and What They Did to Us 过程中遇到的文化背景翻译障碍，以研究此类障碍的解决策略和规律为目标，通过具体的翻译工作，在前人的理论基础上对译本中大量译例进行分类研究，将障碍分为词义空缺、语义空缺和文化空缺，与归化异化理论相结合，总结出可以应用于实际翻译行为的策略和普适性规律，并进行了可行性和普适性验证。本文成果将对研究由文化背景产生的翻译障碍问题起到借鉴作用。﹀
外文摘要：	︿ As an important element of national culture, language is one of the carriers and manifestations of culture, and it is also the most significant means of cultural inheritance and transmission. Although different nationalities have different cultures and languages, their cognitive approaches and attitudes are basically similar. Therefore, once the barrier of cultural background is overcome, people can communicate through translation. As a result, how to deal with the barrier of cultural background in translation is very important. However, many translators pay little attention to this barrier, which leaves readers unable to understand the original work accurately and results in cultural loss. From the middle of the last century, western scholars began to focus on translation strategies for cultural connotation. Both Nida and Newmark proposed corresponding cultural translation theories. China, however, started late in cultural translation research. Although some scholars have developed cultural translation theories suited to English-Chinese translation, no systematic strategy for cultural translation has emerged so far. This paper examines the barriers of cultural background encountered in the translation of The Big Screen: The Story of the Movies and What They Did to Us. With the purpose of studying the strategies and rules that can be used to address these barriers, this paper studies a large number of translation examples on the basis of existing theories, and classifies the barriers into lexical gap, semantic void and cultural vacancy. Combining this classification with the theory of domestication and foreignization, it summarizes strategies and general rules that can be applied in actual translation, and verifies their feasibility and general applicability. The result can serve as a reference for research on translation barriers arising from cultural background. ﹀
分类号：	H315.9
论文总页数：	164
参考文献总数：	42
馆藏号：	017/M2014(98)
公开日期：	2014-05-30

英汉翻译中的“不译”策略研究——以《言论的权力：生活中的语言政治》为例.何昕

链接

题名：	英汉翻译中的“不译”策略研究——以《言论的权力：生活中的语言政治》为例
姓名：	何昕
学号：	1101210671
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	柏晓静
导师1单位：	软件与微电子学院
导师2姓名：	张宏岩
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-30
关键词：	“不译”策略源文文本分析译文受众分析语言和文化差异
论文摘要：	︿传统观念中，称职的译者应该运用所学将源文中所有内容一字不落地翻译成目标语，否则其翻译就会被认为是处于未完成状态。传统翻译策略研究多聚焦于“怎样译”这个问题上，而忽略了此问题的前提条件，即“译”与“不译”问题。笔者认为根据文本类型和译文受众情况，有时候保留原文“不译”所达到的翻译效果比“译”更到位。目前国内外对“不译”策略的研究主要存在两个问题。其一，“不译”概念本身界定不明确，和“不可译”、“省译”、“零翻译”等概念经常混为一谈；其二，已有的“不译”策略研究通常局限于孤立的词（如术语、人名）或句子（如谚语、广告词），没有联系到具体的原文语境以及译文读者的阅读需求和阅读水平，导致所得的“不译”策略应用性不强，对译者指导意义不大。基于上述问题，本文从一本美国社会语言学读物的英译汉实践出发，借鉴和参考国内外学者对“不可译”、“可不译”、“零翻译”等问题的研究，对“不译”这个概念的内涵和外延进行阐释。在主体部分结合翻译实例，通过原文文本和译文受众分析，对“不译”翻译策略的若干问题进行探讨：能不能译、应不应译、具体怎么“译”和“不译”等，并在最后尝试对影响“不译”策略的语言和文化因素进行分析。﹀
分类号：	H087/TP391
论文总页数：	182
参考文献总数：	0
馆藏号：	017/M2014(597)
公开日期：	2014-05-30

第二人称代词在科技英语中的汉译策略研究——以《技术写作101》为例.杨莹

链接

题名：	第二人称代词在科技英语中的汉译策略研究——以《技术写作101》为例
姓名：	杨莹
学号：	1101211022
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	柏晓静
导师1单位：	软件与微电子学院
导师2姓名：	张宏岩
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-30
关键词：	第二人称代词科技英语翻译策略语料
论文摘要：	︿人称代词作为语言的有机组成部分，虽然结构用法相对简单，但在指示语的研究中占有重要地位。第二人称代词在英汉两种语言中承担类似的衔接功能和人际功能，但由于依赖程度不同，在英汉翻译转换时呈现一定的特点。同时科技英语翻译在近年来也得到广泛重视，但许多译者在翻译第二人称代词时只是一味简单地直译，造成冗余，这促使笔者尝试探讨其它有效的翻译策略。本文在《技术写作101》翻译项目的基础上，将第二人称代词与科技英语翻译结合研究，探讨了第二人称代词的常规用法和非常规用法，并尝试分析影响翻译科技英语中第二人称代词的三大因素。本文采用语料统计法讨论了第二人称代词在科技文本中的显化和隐化现象，语料包括北大CCL双语平行语料库中的科技文本和两个科技翻译项目。笔者尝试总结了第二人称代词在科技英语翻译过程中的五种翻译策略：省略法、具体指代法、泛指替换法、角色转换法和直译法，并使用翻译项目中的实例依次进行阐述。﹀
外文摘要：	︿ As an integral part of language, personal pronouns play an important role in deixis research, though their structure and usage are relatively simple. While second-person pronouns in English and Chinese share similar cohesive and interpersonal functions, they demand special treatment in E-C translation because the two languages rely on them to different degrees. With the increasingly wide use of EST (English for Science and Technology) in recent years, EST translation has received much attention. However, many translators render second-person pronouns in EST by literal translation alone, which causes redundancy and motivates the study of other effective strategies. This paper is based on the E-C translation practice of Technical Writing 101. It analyzes the conventional and unconventional usage of the second-person pronoun and the three main factors affecting its translation, focusing on the second-person pronoun in EST translation. Based on the science and technology texts of the PKU CCL English-Chinese parallel corpus and two other EST translation projects, this paper discusses the explicitation and implicitation of the second-person pronoun by means of statistical analysis. It concludes that there are five main strategies for translating the second-person pronoun in EST: ellipsis, specific reference, generic replacement, role shift and literal translation, each of which is explained with examples. ﹀
分类号：	H085
论文总页数：	211
参考文献总数：	28
馆藏号：	017/M2014(351)
公开日期：	2014-05-30

移动应用的本地化翻译研究——以WeChat Android 5.0.3为例.严敏

链接

语义韵视角下英语新闻报道中模糊限制语的翻译研究.王蕾

链接

题名：	语义韵视角下英语新闻报道中模糊限制语的翻译研究
姓名：	王蕾
学号：	1101210928
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	王雷
导师1单位：	外国语学院
论文答辩日期：	2014-05-30
外文题名：	Research on the Translation of Hedges Used in News Report from the Perspective of Semantic Prosody
关键词：	语义韵；新闻报道；模糊限制语；语用功能；翻译
外文关键词：	Semantic prosody; News Report; Hedges; Pragmatic Functions; Translation
论文摘要：	︿新闻报道的功能是真实准确地报道客观发生的事实。新闻报道中的语言要求准确、简洁。模糊限制语（Hedges）是附加在意义明确的表达形式之前的词或短语，可使本来意义明确的概念变得模糊。在新闻报道中使用模糊限制语，恰恰可以使新闻的描述更符合事实，并能更准确地反映出新闻的主旨和新闻作者的思想。目前，针对新闻报道中模糊限制语翻译的研究主要集中在语义学和语用学两个方面，从语义韵的角度研究模糊限制语翻译问题的学者还比较少。本文作者尝试从英国国家语料库（British National Corpus，简称BNC）的新闻报道分库中检索出使用频率较高的模糊限制语，并根据何自然先生对模糊限制语的分类，选取20个常用的模糊限制语，利用Sketch Engine的词汇速描（Word Sketch）功能和人工标注，对这20个模糊限制语进行语义韵层面的分类，并分析不同语义韵的模糊限制语产生的语用效果。此外，本文作者编写了一个网页抓取工具，从“可可英语网站”抓取了737篇BBC英语报道，并利用雪人CAT软件的记忆库功能，创建了一个包含24,399个句对的双语平行语料库，用于分析新闻报道中模糊限制语的翻译情况。本研究发现英语新闻报道中常用的模糊限制语有119个，分为积极语义韵、消极语义韵和中性语义韵三大类。具有积极语义韵的模糊限制语可以增强说话者的情感强度，委婉地表达说话者的情绪或观点和增强新闻主题说服力；具有消极语义韵的模糊限制语具有顾及听者的面子和缓和语气的功能；具有中性语义韵的模糊限制语用于表现说话者谦虚的态度，确保新闻的准确性和真实性，使报道更符合读者的阅读习惯和削弱说话者的肯定语气。通过对自建双语语料库中模糊限制语翻译的分析可知，有些译者利用中文的地道表述在译文中体现了模糊限制语的语义韵；有些译文因意译、漏译或错译等问题，导致了原文中语义韵产生的新闻情感被削弱等问题。本文通过分析新闻报道中模糊限制语语义韵的语用功能和翻译现状，使译者意识到模糊限制语的语义韵在新闻报道中的重要作用，保证新闻的有效传播。﹀
外文摘要：	︿ The main function of news reports is to report what has happened objectively and accurately. The language of news reports needs to be accurate and concise. Hedges are words and phrases attached to explicit expressions, which blur otherwise definite concepts. Vague language is a kind of flexible language, characterized by uncertainty of extension and indefiniteness of connotation. Vague language used in news reports makes the description more consistent with the facts, and can reflect the gist of the news and the author’s ideas more accurately. Currently, research on the translation of Hedges in news reports mainly focuses on their semantic and pragmatic features. Few scholars have done research from the perspective of semantic prosody. The author retrieves frequently used Hedges from the newspaper sub-corpus of the British National Corpus (BNC) and selects twenty common Hedges according to Mr. He’s classification of Hedges. The collocations of the selected Hedges are examined with the “Word Sketch” function of Sketch Engine and tagged manually. Then, the selected Hedges are divided into positive semantic prosody, negative semantic prosody and neutral semantic prosody. The author also analyzes the pragmatic effects of Hedges with different semantic prosodies. In order to analyze the translation of Hedges from the perspective of semantic prosody, the author writes a web crawler to crawl the 可可英语 (Keke English) website, downloads 737 BBC news reports, and uses the Translation Memory (TM) function of the Snowman CAT software to create a bilingual parallel corpus, which contains 24,399 sentence pairs. The study finds 119 Hedges commonly used in news reports, which can be divided into positive semantic prosody, negative semantic prosody and neutral semantic prosody. 
Hedges with positive semantic prosody can enhance the speaker’s emotional intensity, express the speaker’s emotions and ideas politely and make news reports more persuasive; Hedges with negative semantic prosody can be used to save the listener’s face or moderate the tone of news reports; Hedges with neutral semantic prosody are used to show the speaker’s humility, ensure the accuracy and authenticity of news reports, bring the reports more in line with readers’ reading habits and soften the speaker’s assertive tone. The analysis of the translation of Hedges in the self-built bilingual corpus shows that some translators express the semantic prosody of Hedges through idiomatic Chinese expressions, while some translations weaken the emotion that the semantic prosody conveys in the original, due to free translation, mistranslation or omission. This paper explores the pragmatic functions and the current state of translation of Hedges’ semantic prosody in news reports, aiming to make translators aware of the importance of Hedges’ semantic prosody and to ensure the effective dissemination of news. ﹀
分类号：	H315.9
论文总页数：	85
参考文献总数：	57
馆藏号：	017/M2014(418)
公开日期：	2014-05-30

纪实文本中情感表达的翻译策略研究——以A Problem from Hell: America and the Age of Genocide一书翻译为例.汪炜

链接

题名：	纪实文本中情感表达的翻译策略研究——以A Problem from Hell: America and the Age of Genocide一书翻译为例
姓名：	汪炜
学号：	1101210912
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	王雷
导师1单位：	信息科学技术学院
论文答辩日期：	2014-05-30
关键词：	纪实文本情感表达翻译策略
论文摘要：	︿本文是基于译者对于《来自地狱的难题：美国和种族灭绝时代》（A Problem from Hell: America and the Age of Genocide）一书的翻译，研究在纪实文本中作者如何从词汇和句型的层面表达情感，总结了原文不同于新闻文本和政治性文本的词汇和句型特点，从情感表达的角度出发，以案例说明译者应当采取何种翻译策略。翻译原文作为一部纪实文学作品，原文的用词具有纪实文学简单、明了、直接的特点。纪实文本朴实的用词风格，和作者想要表达的强烈的人道主义情感产生了矛盾，而作者巧妙地利用词的联想含义和利用词的本身含义（情感词）两种方式，来间接地渲染客观情绪、映射主观情感。译者用案例总结和分析了原文中利用词的联想含义表达情绪的例子，并进行了例证分析。为了对情感词进行提取、分类和翻译进行了如下工作：（1）通过词性标注软件Stanford POS Tagger对翻译原文中词汇的词性进行了标注，（2）利用Python的正则表达式提取出了翻译原文中的高频词，（3）根据美国匹兹堡大学的MPQA情感词词典，筛选出了含有情感意义的形容词、动词和名词，从情感表达的层面对这些情感词的翻译进行了分析和研究。从文体和词汇层面，原书兼具了政治性文本、新闻文本的双重属性，本文分析了这两种文本和原书作为纪实文本词汇方面的相同点和不同点，并由此讨论了相应的翻译策略。从句型层面，本文分析了作者通过句型来表达情感的方式，并以案例说明了相应的翻译策略。作者用简单句来更加直接地表达观点，原书中的复杂句在特定情况下译者处理成为了简单句；通过排比问句来加强语气、层层提出质疑、表达愤慨和批判的情绪，译者进行处理之时还需要注意避免用词重复；通过被动句的句型来突出表达平民在面对屠杀时没有还手之力的情形，强调施暴者的残忍，抒发人道主义情感，译者翻译之时需要考量被动句型是否要变为主动的问题。通过研究作者的写作手法，本文梳理了译者应当采取的翻译策略。﹀
分类号：	H087/TP391
论文总页数：	167
参考文献总数：	32
馆藏号：	017/M2014(639)
公开日期：	2014-05-30
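
The word-level pipeline described in the abstract above (regex-based extraction of high-frequency words, followed by filtering against a sentiment lexicon) can be sketched roughly as follows. This is a minimal illustration, not the thesis's code: the sample text, the `TOY_LEXICON` dictionary and the function names are hypothetical stand-ins, and neither the real MPQA lexicon nor the Stanford POS Tagger step is reproduced here.

```python
import re
from collections import Counter

# Hypothetical stand-in for a sentiment lexicon such as MPQA (word -> polarity).
TOY_LEXICON = {"brutal": "negative", "horror": "negative", "hope": "positive"}

# Lowercase words, optionally with an internal apostrophe (e.g. "don't").
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def high_frequency_words(text, min_count=2):
    """Return {word: count} for words occurring at least min_count times."""
    counts = Counter(WORD_RE.findall(text.lower()))
    return {w: c for w, c in counts.items() if c >= min_count}


def emotion_words(freq, lexicon):
    """Keep only the frequent words that the lexicon marks as sentiment-bearing."""
    return {w: (c, lexicon[w]) for w, c in freq.items() if w in lexicon}


sample = ("The brutal campaign spread horror. Witnesses described the brutal "
          "raids, and the horror did not end. Some still held on to hope.")
freq = high_frequency_words(sample)
result = emotion_words(freq, TOY_LEXICON)
print(result)
```

In this toy run, "hope" is in the lexicon but occurs only once, so it is dropped by the frequency threshold; the threshold value itself is an arbitrary choice for the example.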

英汉翻译时主语重新择定技巧——以《城市绿色增长》为例.孙慧杰

链接

题名：	英汉翻译时主语重新择定技巧——以《城市绿色增长》为例
姓名：	孙慧杰
学号：	1101210888
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	王雷
导师1单位：	外国语学院
论文答辩日期：	2014-05-30
关键词：	主语重新择定翻译技巧施事主语当事主语
论文摘要：	︿由于英汉两种语言形合和意合差异导致在句法结构上主语突出与话题突出不同、思维方式差异导致具体充当主语成分上重物称主语与重人称主语不同，英汉翻译时英语主语有时无法直接作为翻译的汉语主语，而需要重新择定汉语主语。本文根据黄伯荣和廖序东在《现代汉语》中将汉语主语分为施事主语、受事主语、当事主语这一分类，试图从目标语汉语的主语分类出发，探讨英汉翻译时重新择定汉语主语的技巧。由于汉语受事主语句或是源于常在汉语自然口语中出现的句法移位，或是保留了英语主语的汉语被动句，而本文所基于的翻译材料《城市绿色增长》作为书面研究报告含有极少需要改变英文主语的汉语受事主语句，因此本文主要探讨英语主语变为汉语施事主语和当事主语两种。由于汉语是动词优先的语言，因此英汉主语翻译时首先识别原文是否含有施动动词，根据“施事——受事”结构出发，凸显施事主语；当原文不含有“施事——受事”结构时，再从“话题——说明”结构出发，凸显汉语话题，即当事主语。考虑到汉语无主句这一特殊现象，因此也探讨了汉语省略主语的情况，但由于汉语无主句必然有施动动词作谓语，因此本文将汉语无主句归类在“施事——受事”结构下讨论，视为省略施事主语句。关于英语大量采用形式主语“it”这一特殊现象，由于“it”作形式主语时只是代行句法功能而无词汇意义，除非译为汉语无主句，常需要根据语义找出真正话题对象做主语，因此归类在“话题——说明”结构下讨论。通过上述方法将英汉翻译时主语重新择定技巧简化为两大类，以期具有借鉴意义。﹀
分类号：	H059
论文总页数：	182
参考文献总数：	51
馆藏号：	017/M2014(92)
公开日期：	2014-05-30

医学英汉翻译实践中模糊语的处理——以《营养学——知识与运用》为例.王崇毅

链接

题名：	医学英汉翻译实践中模糊语的处理——以《营养学——知识与运用》为例
姓名：	王崇毅
学号：	1101210916
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	王雷
导师1单位：	外国语学院
论文答辩日期：	2014-05-30
外文题名：	A Study of the English-Chinese Translation of Fuzzy Language in Medical English--Exemplified by the Translation of Nutrition--Science and Applications
关键词：	医学英语；英汉翻译；模糊语；处理原则
外文关键词：	Medical English; English-Chinese Translation; Fuzziness; Principles
论文摘要：	︿学术界关于模糊语的讨论涵盖了模糊语的界定、分类、功能及翻译策略等方方面面。然而，这些讨论中仍有不少亟待解决的问题。首先，学术界对模糊语的界定存在很大的争议，许多讨论往往将其与概括、多义等现象混淆。其次，学术界关于模糊语在翻译中的处理方法的讨论有同质化的倾向，缺乏操作性。同时，学术界对模糊语与翻译的讨论多集中在文学体裁上，而针对医学等非文学体裁中的模糊语的处理则相对少见。本文为解决以上问题，从普遍为人所接受的一类模糊语现象的定义入手，分析了模糊语的一般特征并指出其根本特征是类属边界的不确定性。在此基础上，本文分析了模糊与概括及多义的联系和区别，并指出语言的模糊性往往与概括性相伴相随，而判断某语言单位体现的是模糊性还是概括性关键在于问题是否由于类属边界的不清晰引起。如果是，则属于模糊语的范畴，如果不是，则不属于本文所讨论的模糊语。在对模糊语做出明确的界定后，本文对医学英语中的模糊语进行了分类并对其特征进行了说明。与其它文体中的模糊语相比，本文认为医学英语中的模糊语的特征体现在以下三个方面：其信息功能强，是医学文体客观严谨的体现；为降低语言的模糊性，存在大量的硬性规定；与文学体裁相比，模糊语类属边界的可调查性更大。基于模糊语的一般特征和医学英语中模糊语的特征，本文提出在处理模糊语时应注意以下几点。首先，针对较大一部分的模糊语，译者的任务是根据汉语习惯选择与原文在类属边界上相当的模糊语。其次，针对部分模糊语在汉语中存在“一对多”现象，译者可通过平行文本搞清源语中模糊语的类属边界，选择恰当的模糊语并通过译者注的方式将源语中模糊语的可能性告知医学文献的使用者。最后，要分别注意语境和硬性规定对类属边界的影响。﹀
外文摘要：	︿ Discussions about fuzzy language have covered a wide range of subjects, including its definition, classification, functions and translation strategies. However, there are still problems that need to be solved. Firstly, the definition of fuzziness is still controversial, and many discussions blur its boundary with generality, ambiguity, etc. Secondly, discussions about strategies for handling fuzziness in translation tend to be homogeneous and hard to apply. Thirdly, most of these discussions have focused on literary genres, while discussions of non-literary genres, especially medical English, are quite limited. To solve these problems, this thesis begins with the most commonly accepted definition of fuzzy language, analyzes the general characteristics of fuzzy language, and points out that its fundamental characteristic is the indeterminacy of generic boundaries. Based on this definition, the thesis analyzes the relationship and difference between fuzziness, generality and ambiguity, and demonstrates that fuzziness often accompanies generality, and that the criterion for distinguishing the two concepts is whether the problem is caused by the indeterminacy of a generic boundary. If it is, it belongs to fuzzy language; if not, it is not the fuzzy language discussed in this thesis. After these analyses, the thesis classifies fuzzy language in medical English and describes its characteristics. Compared with fuzzy language in other styles, fuzzy language in medical English is highly informative, which reflects the objective and precise style of medical writing; it is subject to many prescriptions that reduce the fuzziness of language; and its generic boundaries are more open to investigation than those in literary genres. Based on these features, the thesis proposes the following strategies for handling fuzziness in English-Chinese medical translation. Firstly, when dealing with most fuzzy language in medical English, the translator's priority is to find equivalent fuzzy language that shares similar generic boundaries. 
Secondly, when dealing with "one-to-many" fuzzy language in medical English, the translator after doing enough background research through parallel texts should choose the most likely counterpart in Chinese, and inform the readers of the possible generic boundaries of the fuzzy language through the translator's notes. Lastly, due attention should be paid to the influence of context and prescriptions on the generic boundaries of fuzzy language. ﹀
分类号：	H3
论文总页数：	195
参考文献总数：	35
馆藏号：	017/M2014(95)
公开日期：	2014-05-30

2014-05-29

结构化稀疏方法在语法自动纠错中的应用.李欢

链接

题名：	结构化稀疏方法在语法自动纠错中的应用
姓名：	李欢
学号：	1201210647
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-29
关键词：	结构化稀疏优化临近线性交换方向法结构化稀疏最大熵模型特征选择语法自动纠错
论文摘要：	︿本文的主要贡献包括：1)在前人已有的使用交换方向法（ADM，Alternating Direction Method）求解Basis Pursuit问题的结构化稀疏优化的基础上，扩展到求解任意损失函数的结构化稀疏优化，将结构化稀疏优化和自然语言处理领域中应用广泛的最大熵模型进行融合，提出了结构化稀疏最大熵模型，从而更加适用于自然语言处理领域。2)针对ADM在求解任意损失函数的结构化稀疏优化时需要内部循环求解子问题的不足，本文提出了pLADM（proximal Lineared ADM）算法，使得每个子问题具有封闭式解，简化了子问题的求解，与ADM需要双重循环的不足相比，pLADM只需要一重循环，从而提高了算法的运行速度，本文证明了pLADM具有全局收敛性以及的收敛速度，模拟数据实验和真实数据实验都验证了pLADM的运行时间是传统ADM的1/7。pLADM适用于混合正则化稀疏模型和任意交叠结构的结构化稀疏模型。3)本文将基于结构化稀疏方法的最大熵模型应用到语法自动纠错任务中，与传统的最大熵模型相比，在性能上取得了提升。﹀
分类号：	TP391.43
论文总页数：	58
参考文献总数：	30
馆藏号：	017/M2014(530)
公开日期：	2014-05-29
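
The abstract above stresses subproblems with closed-form solutions. One widely used closed-form building block in structured sparse optimization is the proximal operator of the group-lasso penalty (block soft-thresholding). The sketch below illustrates only that operator for non-overlapping groups; it is not the pLADM algorithm from the thesis, and the function name, the grouping and the sample values are assumptions made for illustration.

```python
import math


def group_soft_threshold(x, groups, lam):
    """Proximal operator of lam * sum_g ||x_g||_2 for non-overlapping groups.

    x      : list of floats
    groups : list of index lists that partition (part of) x
    lam    : non-negative threshold
    Each group is shrunk toward zero as a block; a group whose Euclidean
    norm is at most lam is set to zero entirely.
    """
    out = list(x)
    for g in groups:
        norm = math.sqrt(sum(x[i] ** 2 for i in g))
        # Shrinkage factor in [0, 1); zero when the whole group is removed.
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        for i in g:
            out[i] = scale * x[i]
    return out


x = [3.0, 4.0, 0.1, -0.1]
shrunk = group_soft_threshold(x, groups=[[0, 1], [2, 3]], lam=1.0)
print(shrunk)
```

The first group has norm 5 and is scaled by 0.8, while the second group has norm below the threshold and is zeroed as a whole, which is the group-level sparsity pattern that structured penalties are designed to produce.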

网络广告关键词的流量预估.张健

链接

题名：	网络广告关键词的流量预估
姓名：	张健
学号：	1201210983
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-29
关键词：	展现量预估点击量预估 Apriori算法一阶动态线性模型
论文摘要：	︿近年来，随着计算机的普及，互联网广告蓬勃发展，产生了一系列难题，这些难题成为工业界和学术界研究的一大热点。搜索广告是互联网广告中，比较常见的广告，它是指用户在使用搜索引擎搜索内容时，在结果页面上与搜索结果混合在一起显示的广告。为了达到广告主想要的收益，广告主通常需要在搜索推广平台上提交想要竞买的关键词，并设置关键词的出价、匹配方式以及投放地域等，以能够将广告显示给广告主寻找的那部分网民看，找到潜在客户，达到精准投放广告的目的。对广告主来说，不同的关键词、出价、匹配模式、投放地域等设置，能够达到的效果差别较大，广告主往往需要花费大量的人力、财力去人工的尝试出他们认为效率最好的广告投放设置。然而，随着近些年来使用搜索广告平台进行产品推广的广告主越来越多，投放力度越来越大，广告主的水平参差不齐，对能够自动化提示广告主相应的广告设置能够给他带来的效果的需求越来越强烈，广告主关注的一般是展现量、点击量、消费、排名等。目前市场上并没有成熟的工具可供广告主使用，且相关研究极少，本文尝试使用机器学习的方法来解决这个问题。本文主要研究的内容是根据广告的关键词、出价、匹配模式、投放地域，预估在该设置下的广告能够获得的展现量、点击量。由于有效特征极少，因此需要扩展相关特征。由于广告在不同匹配模式下，所能够被触发的查询词差别很大，且能直接影响最终的广告展现量，因此本文创新性的提出采用两步走的方法：（1）从日志中挖掘出，关键词在某匹配模式下所能够被触发的查询词。采用Apriori关联挖掘算法从日志中挖掘出查询词归一化表，对关键词的查询词列表进行归一化。（2）采用一阶动态线性模型，预估单个查询词的PV流量。结合（1）（2）计算得到关键词在某匹配模式下的PV流量。本文还试图从日志中，挖掘出广告主的竞争激烈程度等等信息以辅助最终的模型预估。最终，本文分别采用LR和GBDT模型来对展现量进行预估，并对二者的预估结果做一次融合，得到最终的展现量预估结果。对点击率的预估，本文采用LR模型先预估点击率，然后使用展现量与点击率的乘积得到最终的点击量。﹀
分类号：	TP39
论文总页数：	56
参考文献总数：	29
馆藏号：	017/M2014(552)
公开日期：	2014-05-29
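
Two steps from the abstract above lend themselves to a small worked example: forecasting a query's traffic with a first-order dynamic linear model, and obtaining clicks as the product of impressions and click-through rate. The sketch below uses the standard local-level filtering recursion; the variance settings, the function names and the sample numbers are assumptions for illustration and do not come from the thesis.

```python
def local_level_forecast(series, obs_var=1.0, evo_var=0.1, m0=None, c0=1e6):
    """One-step-ahead forecast from a first-order (local level) DLM.

    Model: y_t = mu_t + v_t,  mu_t = mu_{t-1} + w_t,
    with Var(v_t) = obs_var and Var(w_t) = evo_var (both assumed known here).
    Returns the filtered level after the last observation, which is also the
    forecast for the next period under this model.
    """
    m = series[0] if m0 is None else m0  # prior mean of the level
    c = c0                               # prior variance (vague by default)
    for y in series:
        r = c + evo_var                  # prior variance after state evolution
        gain = r / (r + obs_var)         # adaptive (Kalman) gain
        m = m + gain * (y - m)           # posterior mean of the level
        c = gain * obs_var               # posterior variance of the level
    return m


def estimate_clicks(impressions, ctr):
    """Clicks estimated as impressions multiplied by click-through rate."""
    return impressions * ctr


daily_pv = [100.0, 120.0, 110.0]         # hypothetical daily query volumes
pv_forecast = local_level_forecast(daily_pv)
clicks = estimate_clicks(pv_forecast, ctr=0.02)
print(pv_forecast, clicks)
```

A larger `evo_var` relative to `obs_var` makes the forecast follow recent observations more closely, while a smaller value smooths more heavily; in practice both variances would be estimated from the logs rather than fixed by hand.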

基于言语行为的特征选择与分类.崔小薇

链接

题名：	基于言语行为的特征选择与分类
姓名：	崔小薇
学号：	1101210611
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-29
外文题名：	Speech Act Classification and Feature Selection
关键词：	特征选择；多核支持向量机；多模态数据；言语行为自动分类
外文关键词：	features selection; multi-kernel support vector machine; multi-modal data; automatic classification of verbal behavior
论文摘要：	︿自然人机交互技术是当前计算机应用技术研究的重要方向。语音识别是实现人机交互的重要途径，而言语行为识别正是人机对话和言语理解中一个十分重要的步骤。本文以人机交互为应用背景，系统分析了言语行为识别，特别是普通话言语行为识别的特点及存在的问题，提出了一种基于多模态特征选择和分类的言语行为识别模式，本文的主要内容概括如下： 1. 声学特征分析与选择。声学关联特征在言语行为识别研究中应用较少，本文在大规模中文口语对话标注语料库的基础上，分析了重音相关的声学特征在不同言语行为状态下的静态规律。同时进行了特征选择和分类实验，确定了重音相关的声学特征向量在言语行为识别领域的使用价值。 2. 语言学特征分析与选择。语言学关联特征是言语行为识别的关键之一。在大规模中文口语对话标注语料库的基础上，本文分析了语气词在不同言语行为状态下的分布情况，并对语气词相关的语言学特征进行了特征提取。将其与重音相关的声学特征进行融合后进行了特征选择和分类实验。证明了这种多模态特征组合能够提升言语行为识别的准确率。 3. 分类算法研究。针对特征数据多模态和分布不平衡的特点，在言语行为识别的分类算法方面，本文提出了一种改进后的多核支持向量机算法。相较于其他算法，改进后的多核支持向量机更能够抓住多模态特性并充分利用其中信息，并且能够适应于不平衡数据的分类任务。考虑到数据的多样性，本文设计了10组实验，实验数据覆盖了二分类、多分类、单一模态和多模态数据类型。在实验中对比了传统支持向量机、基于代价敏感的支持向量机、多核支持向量机和改进后的多核支持向量机等分类器的识别性能。证明了改进后的多核支持向量机具有较好的识别性能。﹀
外文摘要：	︿ Natural human-machine interaction technology is an important direction of current research on computer application technology. Speech recognition is a major way to realize human-machine interaction, and verbal behavior recognition is a crucial step in human-machine dialogue and speech comprehension. Taking human-machine interaction as the application background, this paper systematically analyzes verbal behavior recognition, especially the characteristics and problems of verbal behavior recognition in Mandarin, and proposes a verbal behavior recognition model based on multi-modal feature selection and classification. The main contents of this paper are as follows: 1) Acoustic feature analysis and selection. Acoustic features have seldom been applied in verbal behavior recognition research. Based on a large-scale annotated corpus of spoken Chinese dialogue, this paper analyzes the static patterns of stress-related acoustic features under different verbal behaviors. Feature selection and classification experiments are also conducted to establish the value of stress-related acoustic feature vectors for verbal behavior recognition. 2) Linguistic feature analysis and selection. Linguistic features are one of the keys to verbal behavior recognition. Based on the same large-scale annotated corpus of spoken Chinese dialogue, this paper analyzes the distribution of modal particles under different verbal behaviors and extracts linguistic features related to modal particles. Feature selection and classification experiments are then conducted after these features are fused with the stress-related acoustic features, which shows that this combination of multi-modal features can improve the accuracy of verbal behavior recognition. 3) Classification algorithm study. 
For the classification algorithm, this paper proposes an improved multi-kernel support vector machine algorithm that targets the multi-modal nature and the imbalanced distribution of the feature data. Compared with other algorithms, the improved multi-kernel support vector machine better captures the multi-modal characteristics, makes fuller use of the information they carry, and adapts to classification tasks on imbalanced data. Considering the diversity of data, this paper designs 10 groups of experiments, whose data cover binary classification, multi-class classification, single-modal data and multi-modal data. The experiments compare the recognition performance of several classifiers, including the standard support vector machine (SVM), the cost-sensitive support vector machine, the multi-kernel support vector machine and the improved multi-kernel support vector machine. The results show that the improved multi-kernel support vector machine achieves better recognition performance. ﹀
分类号：	H087/TP391
论文总页数：	76
参考文献总数：	73
馆藏号：	017/M2014(621)
公开日期：	2014-05-29

面向中文问答及对话系统的复述技术研究.张博

链接

题名：	面向中文问答及对话系统的复述技术研究
姓名：	张博
学号：	1101211049
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-29
关键词：	字幕语料；复述识别；单类SVM
论文摘要：	︿复述，是指相同语义在同一种语言内的不同表达方式。复述技术的研究主要包括复述句对识别、复述实例及模板抽取以及复述生成。其中复述识别可以看成是句子相似度计算的一种扩展，它用于识别同种语言的两句话表达的意思是否一致。由于没有明确的定义，复述识别应以语料为事实标准。复述识别在研究中通常作为评测任务，使用规模不可扩展的人工标注语料作为处理对象。但目前尚缺乏规模可扩展的、适用于应用领域的复述语料。例如在问答及对话系统中，用户的输入具有很大的随意性，并且偏口语化，这给问句分析带来很大的难度。如果问答系统有常见问题集（FAQ），理想情况下可以通过复述识别技术将用户的问句匹配到标准问题。然而目前缺乏适用于问答及对话系统的语料，即口语化的句子级复述语料，以及在这种语料上的复述识别研究。本文使用不同版本的影视剧集字幕翻译作为语料来源，从中获取质量较高的规模可扩展的句子级中文口语化复述语料。针对这种只有正例加少量噪声的语料，本文提取多种相似度指标作为特征，在该语料上训练单类SVM分类器，以判断复述句对。在特征提取中，使用了语义词典和复述词对抽取的方法改进识别效果，另外试验了词向量表示特征的方法。本文还对复述识别模块做了工程部署实验和并发测试，证明了本文的方法可以用于真实的工程项目。本文主要内容包括：1）对复述技术研究现状的综述；2）提出使用字幕语料有效获取句级复述语料的方法；3）对单类SVM分类器的研究和分析；4）多种特征的选择和提取；5）实验结果及分析。﹀
分类号：	TP391.1
论文总页数：	52
参考文献总数：	32
馆藏号：	017/M2014(653)
公开日期：	2014-05-29

英语作文介冠词自动改错研究.肖凤霞

链接

题名：	英语作文介冠词自动改错研究
姓名：	肖凤霞
学号：	1101210980
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-29
外文题名：	Research on English Determiner and Preposition Automatic Correction
关键词：	主动学习；特征选择；特征提取；均衡采样；自动改错
外文关键词：	active learning; feature selection; feature extraction; balanced sampling; automatic correction
论文摘要：	︿随着全球化的发展，英语学习越来越得到人们的重视，但由于英语和其它语言语法特点的巨大差异，语法错误层出不穷。其中，介冠词错误是非英语母语学习者最常犯的两类语言错误。过去几年，很多人对这一领域进行了研究，通用方法是使用多类分类器，根据上下文的特征集得到最终的分类结果。本文对介词、冠词自动改错分别进行实验，比较各个分类器及集成分类器的性能，确定最优分类模型，最大程度提高分类准确率。同时，在保留前人研究提出的特征和规则之外，本研究针对介词、冠词改错分别提出了新的特征和规则，并利用Filter 特征提取算法提取特征，确定最优特征。创造性的引入主动学习算法,在此基础上更新分类器。该方法解决了原来需要大量数据进行初始训练的问题。本文主要包括以下基本内容：首先，在总结了介冠词自动改错的研究背景和意义及国内外有关语料选择、特征选择、分类算法、改错模型构造的研究现状之后，提出了本文的研究内容和目标。其次，总结了特征提取和特征选择之间的关系，探讨了主动学习算法的应用和相关策略，并讨论了贝叶斯、基于规则以及决策树三种分类算法。第三，在保留前人研究提出的特征和规则之外，本研究针对介词、冠词改错分别提出了新的特征和规则，并通过实验验证了相关新特征的加入提高了介词、冠词自动改错准确率。同时，通过特征选择提取最优特征集。第四，针对介词和冠词分别测试三种类型分类算法及集成分类器的性能，确定最优分类器。第五，阐述作文介词、冠词自动改错的整个流程，设计并实现了介词、冠词的自动改错。第六，提出了两种新的主动学习算法，一种是基于偏置因子的主动学习，另一种是基于均衡采样的贝叶斯主动学习，旨在缩小样本规模的同时提高改错性能。同时，总结了介词、冠词自动改错当前存在的问题。﹀
外文摘要：	︿ With the development of globalization, people have placed more importance on English study. But because of the great differences between English and other languages, there still exist lots of grammatical errors, of which determiner and preposition errors are most frequent. During the past several years, people have done some research on determiner and preposition automatic correction. They usually took automatic correction as a process of classification. Based on the deep study of determiner and preposition automatic correction across the world, the paper compared different classification algorithms and then determined the optimal classifier to improve the accuracy rate of the classification. Meanwhile, this paper extracted new features and put forward new rules for preposition and determiner correction respectively, and validated whether the correction accuracy was improved after adding these new features and rules. In addition, it selected the optimal feature set respectively for later preposition and determiner correction. Besides, the active learning algorithm was introduced to update the classifier. In this way, the scale of the training set was significantly reduced. The main work of this paper was as follows: First, on the basis of a summary of the background and significance of preposition and determiner correction, this paper discussed the issues of selecting and processing the classification corpus, extracting the features, determining the classification algorithm and constructing the correction module. Then the paper put forward a corresponding solution. Second, it summarized the relation between feature extraction and feature selection and discussed the application and strategy of the active learning algorithm. Meanwhile, Naive Bayes, rule-based classification and Decision Tree were mainly demonstrated. 
Third, it extracted new features and put forward new rules for preposition and determiner correction respectively, and validated whether the correction accuracy was improved after adding these new features and rules. In addition, it selected the optimal feature set respectively for later preposition and determiner correction. Fourth, it trained seven different classification modules on the same training set and determined the best classification module for preposition correction and determiner correction. Fifth, it described the whole process of determiner and preposition automatic correction, and realized the system in detail. Sixth, it put forward two new active learning methods, one based on deviation factor, another based on balanced sampling. Meanwhile, it compared the performance of different methods through related experiments, and summarized the difficulties in determiner and preposition automatic correction. ﹀
分类号：	TP399
论文总页数：	75
参考文献总数：	38
馆藏号：	017/M2014(666)
公开日期：	2014-05-29

英文作文自动评分算法研究及系统实现.刘建阳

链接

题名：	英文作文自动评分算法研究及系统实现
姓名：	刘建阳
学号：	1101210800
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-29
关键词：	作文自动评分；自然语言处理；机器学习；主动学习
论文摘要：	︿本文意在设计实现一套低成本、快速、稳定、易于使用的国人英语作文自动评分系统，解决英语教学过程中作文批改人力、物力资源消耗过大和主观因素造成的评分不一致等问题，降低英语作文批改压力、提高英语教学质量。作文自动评分即计算机利用自然语言处理相关技术对作文进行自动评分，首先将作文形式化表达为一组向量，然后利用已评分作文建立数学模型、归纳评分经验，最后将待评分作文形式化表达、利用数学模型进行预测评分。本文实现的作文自动评分系统包括特征抽取引擎和预测评分引擎两部分，前者负责作文形式化表示，后者负责评分模型训练和作文预测评分。特征抽取引擎实现一组作文特征抽取器，从语言学和统计学两个视角设计11类特征并辅以61张词表抽取共计231维特征，保证特征能够具有较强的描述能力和泛化能力。预测评分引擎在评分模型训练时使用Bayes、SVM、Logit和决策树四种机器学习模型，利用已评分作文对模型进行训练、参数调优，采用Kappa值对模型效果进行评比，并选取最优模型对待评分作文预测评分。本文在评分模型训练过程中引入主动学习方法，意在降低模型训练成本，对比基于不确定度、基于委员会和预聚类三种主动学习策略的效果；同时，本文基于Thrift架构对特征抽取引擎和预测评分引擎进行封装，使系统通用性和扩展性大大增强。实验表明，本系统具有较强的可靠性，其预测评分的Kappa相关性系数达到0.79；本系统引入主动学习方法，训练集规模降低约25%，有效降低系统使用成本；本系统通过稳定性测试和并发测试，具有良好的稳定性和较强的并发能力。﹀
分类号：	TP399
论文总页数：	66
参考文献总数：	0
馆藏号：	017/M2014(100)
公开日期：	2014-05-29

面向团购服务的推荐算法的研究与实践.洪春晓

链接

题名：	面向团购服务的推荐算法的研究与实践
姓名：	洪春晓
学号：	1201210598
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-29
外文题名：	Research and Practice of Recommendation Algorithm for Group Buying Service
关键词：	推荐系统；团购；冷启动；数据稀疏
外文关键词：	Recommended System; Group Buying; Cold Start; Data Sparseness
论文摘要：	︿团购是一种新兴的电子商务模式，它利用网络平台这个媒介，把具有同样需求和购买意向的消费者聚集在一起，增加消费者和商家的议价能力，使其享受优惠的购买价格。随着团购的快速发展，团购的商品规模迅速变大，出现了信息过载问题，需要推荐系统来解决。虽然推荐系统在电子商务网站中的应用已经比较成熟，但是团购和普通电子商务模式存在显著的差异：商品的生命周期短，更新速度快，商品集合每天都在变化，这导致了数据稀疏、商品冷启动问题很严重。由于团购服务数据的这些特点，导致常见的推荐算法并不能很好地解决团购场景下的推荐问题，因此本文致力于研究如何在团购场景下为用户推荐合适的商品。为了解决这个问题，本文做了相应的工作：1. 针对团购场景下的数据分布特点进行分析，证明传统的协同过滤方法难以提供高质量的推荐内容; 2. 针对TF-IDF不适用于基于短商品内容计算商品相似性的问题，提出了一种根据用户行为计算带权词典的方法; 3. 针对计算商品内容相似度时存在的语义问题，提出三种方法来解决; 4. 针对用户对于不同类型商品的兴趣衰减速度存在差异的问题，提出利用用户行为数据计算类目兴趣衰减函数的方法; 5. 针对冷启动问题及数据稀疏问题，提出三种方法来解决：第一种通过计算商品的内容相似表，然后使用ItemCF的思想进行推荐；第二种把用户行为从商品级别映射到词簇级别，然后通过Logistic Regression模型对用户在词簇级别上的行为建模，预测用户感兴趣的商品；第三种通过把商品聚类的方法，构建用户基本兴趣点，然后通过多种模型为用户在商品类簇上的行为建模，预测用户感兴趣的商品类簇。最终，本文通过Learning to Rank方法，把上述提出的模型进行融合、根据用户兴趣和商品内容建立一个排序函数，并通过排序函数为用户推荐用户最感兴趣的商品。﹀
外文摘要：	︿ Group buying is a new kind of e-business model, which utilizes the network platform to bring together consumers that have the same demand and purchasing intention, increasing their bargaining power so that they can enjoy preferential purchase prices. With the rapid development of group buying, the scale of group-buying goods grows rapidly, which leads to an information overload problem, so we need a recommendation system to solve this problem. Although the application of recommender systems in e-commerce web sites is comparatively mature, as a new e-commerce model, group buying differs significantly from traditional e-commerce. In the scenario of group buying, the life cycle of a product is short and the collection of products changes every day, which makes data sparse and leads to a serious cold-start problem. Due to these special characteristics of group buying, common recommendation algorithms used in e-commerce cannot provide good recommendations for group buying, so this paper is dedicated to researching how to recommend suitable products for users in group buying. To solve this problem, our research focuses mainly on these aspects: 1. Analyse the data distribution characteristics in the scenario of group buying, and show why the traditional collaborative filtering method can't provide high quality recommendations as it did before; 2. To solve the problem that TF-IDF is not suitable for calculating the content similarity of goods based on short content, we propose a method which can calculate the weighted dictionaries based on user behavior; 3. To address semantic problems when calculating the similarity of goods, we propose three methods; 4. To take into account the users' different interest attenuation in different types of products, we calculate the attenuation function of category interest by using user behavior data; 5. 
In order to solve the cold-start problem and the data sparsity problem, we propose three methods: first, we obtain the table of goods content similarity through calculation, and then make recommendations based on the idea of ItemCF; second, we map user behavior from the goods level to the word cluster level, and then use the logistic regression method to build a model on word-cluster-level user behavior data, which is used to make personalized recommendations; third, we build up users’ basic interests through the goods clustering method, and then use a variety of methods to build models on user behavior data at the goods cluster level so as to forecast the goods clusters that users may be interested in. Finally, using the Learning to Rank method, the models proposed above are merged to create a ranking function based on user interest and goods content, and the goods of greatest interest are recommended to users through that function. ﹀
分类号：	TP311.52
论文总页数：	81
参考文献总数：	63
馆藏号：	017/M2014(167)
公开日期：	2014-05-29

金融业网页采集系统的研究与实现.周舵

链接

题名：	金融业网页采集系统的研究与实现
姓名：	周舵
学号：	1001221172
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2014-05-29
关键词：	主题爬虫；信息抽取；SVM；kNN
论文摘要：	︿近年来，主题爬虫领域研究的焦点主要集中于以下两个问题： 1）如何提高主题爬虫主题判断和分类的准确性？ 2）如何提高主题爬虫搜索效率，增加爬虫的覆盖率？目前，主题网络爬虫的主题判断大多采用网页内容分析的方法。其中，最常用的算法主要有：VSM算法、朴素贝叶斯算法、SVM算法、kNN算法等。但这些算法的主题判断性能普遍不高，因此需要探索更好的解决方案。另一方面，目前的主题网络爬虫搜索算法，大多都属于局部搜索算法，如果能够从更广的网络空间中进行相关网页的搜索，无疑会显著提高主题爬虫的覆盖率。本文针对以上两个问题，主要做了以下几个方面的研究工作： 1）采用遗传规划算法寻找一个最优相似度计算函数，并将这个最优函数应用于kNN分类器，显著提高了主题爬虫主题判断的性能。 2）本文提出使用元搜索引擎，将查询得到的相关网页URLs不断加入到爬虫的抓取队列，引导主题爬虫爬向相关主题的网络社区，提高爬虫搜索效率和覆盖率，使得抓取到的网页集的质量提高。 3）本文给出了使用遗传规划和元搜索引擎混合的主题爬虫设计框架和流程，并且加以实现，可以准确高效地抓取主题领域的网页。本文最后对网页采集系统进行了详细的测试，测试参数综合考虑准确率和召回率，采用Macro F1参数。最终以多组实验样本数据，证明了采用遗传规划算法和元搜索引擎混合的主题爬虫，比单纯采用支持向量机（SVM）算法的主题爬虫，有更好的主题判断性能。﹀
分类号：	TP391.1
论文总页数：	57
参考文献总数：	0
馆藏号：	017/M2014(833)
公开日期：	2014-05-29

面向科技文献的语义标注平台的研究与开发.张子渊

链接

题名：	面向科技文献的语义标注平台的研究与开发
姓名：	张子渊
学号：	1101211096
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	刘耀
导师1单位：	软件与微电子学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2014-05-29
关键词：	科技文献；语义标注平台；定制化管线
论文摘要：	︿语义标注平台可以有效的组织数字资源,从结构化的领域专业文本中提取所需要的知识片段。现有的语义标注平台面对的一个很大的挑战是它们往往难以改写以及重用程度较差。目前一些现有的语义标注平台在通用领域已经表现出较高的性能和较高的分析准确率，然而面向专业领域的语义标注平台技术现在还处于初级阶段。另外，科技文献处理任务也变得越来越多样化和多元化，如何迅速的解决处理任务也是一个亟待解决的问题。在现阶段，软构件技术在科技文献的语义标注系统上还未有效的应用。针对上述情况，本文开发了面向科技文献的语义标注平台,并且在此基础上,对专业领域语义的相关问题进行了深入研究和探讨，对科技文献处理任务，实现了定制化的管线处理，提出了模型构建的思想，同时对现有的切分词、词性标注、句法分析等工具在专业领域中进行了优劣对比，将它们开发成可重用部件嵌入到平台中，并进行了文本的可视化展示，主要研究内容为：（1）设计实现了面向科技文献的语义标注平台，拥有可重用的组件的框架和可视化的交互界面。平台的核心功能模块包括语料和文档模块，处理部件模块，定制化管线模块和文本可视化模块，并提出了实现平台的关键技术。（2）分析研究了基于自然语言处理的科技文献语义标注技术，包括切分词、词性标注、句法分析和篇章分析。通过分析相关语义标注系统和流程，实现了将科技文献切分词、词性标注、句法分析和篇章分析系统开发为可重用的部件，并将之嵌入了本平台。（3）在分析已有语义标注平台的基础上，给出并实现了面向科技文献语义标注的一种定制化的管线思想，除了能够进行科技文献处理部件的自由组合之外，还可以进行语料之间不同处理部件的优劣对比，解决了目前科技文献处理需求多样化，多元化的问题。（4）研究了专利文献的特征，发现依存句法分析可以有效的增加知识抽取规则，有利于我们抽取专利知识，对文本理解等领域起到了有益的作用。利用以上平台对科技文献处理部件进行测试对比试验，能够有效的完成部件测试对比工作，对研究人员来说是可以比较部分部件性能的测试平台，为更好地普及和应用面向专业领域的语义标注系统做了有益的促进。﹀
分类号：	TP311.52
论文总页数：	82
参考文献总数：	0
馆藏号：	017/M2014(110)
公开日期：	2014-05-29

2014-01-04

功能主义翻译理论和读者反应理论视角下的英文简历翻译实践.陈磊

链接

题名：	功能主义翻译理论和读者反应理论视角下的英文简历翻译实践
姓名：	陈磊
学号：	1001210538
论文语种：	chi
专业：	软件工程（二级学科）
公开时间：	公开
培养层次：	硕士
学位：	工学硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2014-01-04
外文题名：	English Resume Translation Practice under the Functional Translation Theory and the Reader's Response Theory
关键词：	英文简历；转行求职；功能主义翻译理论；读者反应理论；文体特征；翻译策略
外文关键词：	English Resume; functional translation theory; reader's response theory; stylistic features; translation strategy
论文摘要：	︿经济全球化已成为当代世界发展的重要趋势，跨国公司已成为经济运行的主导力量，英文简历作为通用的简历形式也被广泛接受。而且随着信息交流的加快和深入，职场人士在工作中往往更容易接触到与本职工作相关的其他行业工作，人事部和著名的第三方教育数据咨询和评估机构麦可思的统计数据表明，中国人才市场的职业选择越来越丰富、越来越有活力，为职业人提供了更多、更丰富的机会，跨行业跳槽或转行已成为职场中新的趋势。为了成功应聘到目标岗位，一份好的转行求职简历无疑是十分重要的。笔者翻译了由Wendy S. Enelow 和Louise M. Kursmark所编著的《转行者的专家级简历》一书。该书前三个章节讲述简历写作的注意事项，后面的九个章节以简历范例的形式展示转行简历的具体案例，并附有其转行前后的职业变化及简历写作策略。笔者在翻译过程中对英文简历翻译实践中遇到的难点进行记录，完成翻译后对翻译实践以及国内求职简历的写作进行思考和研究，以上就是本文的写作缘由。笔者以英语母语者所写的简历为研究对象，首先总结了英文简历的文体特征。简历作为一种实用文体，简明精炼地记录了求职者的自我评价和工作经验，具有传递信息和说服招聘者的作用，在文体特征和翻译示例部分对这两个作用进行了针对性地阐述。笔者在语言文体特征方面，分别从书写特征、词汇特征、句法特征三个层面对英文简历的文体特征进行总结概述，明确了英文简历在文体特征层面的特性。其次笔者对英文简历中文译本的读者进行分析，从专业背景和文化背景两个方向研究译文读者即招聘人员，根据译文的预期目的和读者的反应确定翻译策略。之后笔者在功能翻译理论和读者反应理论的指导下，试图从常用词语、个人素质修饰形容词、专业术语、固定表达、长句和文化因素方面分析英文简历翻译过程中文本的翻译难点并确定翻译策略。最后笔者对翻译实践中遇到的非翻译问题进行总结，分析国内和国外求职简历的差异。笔者认为西方国家职场人士的转行英文简历对我国求职者简历写作具有很大的指导和借鉴意义。在本文后面附录翻译对照中的每个简历翻译下面都有其写作策略，可供读者参考。在翻译实践中，译者应该以译文简历的信息传达和读者的反应作为评价标准，力求尽量完整地传达原文信息。因注重目标读者感受和反应，功能翻译理论和读者反应论特别适合指导简历翻译。在不影响译文预期功能和读者反应的情况下，应该以归化为主要翻译策略，在实际的翻译中灵活地运用多种翻译方法，确定最终译文。笔者希望借此研究为英文简历的写作以及简历写作教学提供参考和建议。﹀
外文摘要：	︿ As economic globalization has become an important trend in the development of the contemporary world and transnational corporations have become a dominant force in the economy, the English resume (CV), as a general resume form, has been widely accepted by job applicants. In order to land the target position, a good resume is undoubtedly important. To allow job seekers in China a better understanding of authentic resume writing, the author of the thesis translated the book titled Expert Resumes for Career Changers compiled by Wendy S. Enelow and Louise M. Kursmark, and recorded the difficulties and stylistic features in the process of the translation. The stylistic features, the functional features and the readers of the resume are the focus of this thesis. The stylistic features of the English resume consist in the format, lexicon and syntax. Its functional features refer to the responsibility that the resume, as a concise record of the applicant’s professional experiences and self-evaluation, assumes in communicating information and persuading human resources personnel for opportunities of a job interview. Then readers of the resume are analyzed with regard to their professional and cultural background. As for the strategy of translation, the purpose of translation and the reader’s response should be considered so that free translation is adopted as a main strategy. Special attention is paid to the translation of adjectives describing personal qualities, jargons and keywords, collocations, long sentences and cultural factors. The thesis also discusses through examples some translation techniques that are suitable and useful for resume translation. This thesis concludes that information transmission and reader’s response should be the prior criteria of resume translation. Translators should bear in mind that no rules are universal; they should adopt flexible translation strategies accordingly. ﹀
分类号：	H059
论文总页数：	229
参考文献总数：	24
馆藏号：	017/M2014(987)
公开日期：	2014-01-04

2013-11-30

《奥巴马族》一书中定语从句的翻译技巧.张敏

链接

题名：	《奥巴马族》一书中定语从句的翻译技巧
姓名：	张敏
学号：	1001210996
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2013-11-30
外文题名：	Translation Techniques of Attributive Clauses in The Obamians
关键词：	定语从句；《奥巴马族》；翻译技巧
外文关键词：	Attributive Clauses; The Obamians; Translation Techniques
论文摘要：	︿定语从句在英语表达中有非常广泛的使用，它不仅有限制、修饰或描述先行词等功能，有的时候还与主句之间存在着状语关系，说明原因、结果、条件、目的等。另外，由于英语属于“形合型”语言，具有多种连接词和连接手段，因此，定语从句还能不断地延伸扩展，在句子中的位置也灵活多变，加上中英文之间的语言差异，这些因素都增加了我们翻译的难度。本文对现存的定语从句的翻译技巧进行了梳理总结和深入的讨论，总结出前置法、后置法、溶合法、译成状语分句、插入法、拆译法、“C译法”等翻译技巧，并补充了“译成同位语”的翻译方法。另外，本文还结合定语从句的功能、中英文的主次信息安排特点和汉语的行文规范等总结出了7个翻译定语从句时应该注意的要点。最后，本文确定出《奥巴马族》一书的翻译标准，并选取出书中出现次数比较频繁、较有代表性的定语从句类型，包括作为作者个人评论的定语从句、介绍人物背景的同位语定语从句、多对象语境中的定语从句、介绍条约或规定内容的定语从句以及插入性质的非限制性定语从句等，对它们的翻译要点进行了介绍，并总结归纳出每种定语从句类型的比较好的翻译处理手法，力求使译文达到逻辑严整、层次分明的最佳效果。﹀
外文摘要：	︿ Attributive clauses are widely used in English. They not only restrict, modify or describe their antecedent, but sometimes also have a kind of adverbial relationship with the main clause, indicating the cause, result, purpose, time, condition, concession, etc. An English sentence may be followed by an unlimited number of attributive clauses behind the word being modified, while a Chinese sentence allows only a limited number of words preceding the word being modified. Thus, there is no correspondence between their sentence structures. Moreover, English is a hypotactic language, while Chinese is paratactic, which increases the difficulty of translating attributive clauses. This paper summarizes the existing techniques to translate attributive clauses, including combination, division, mixture, translating into adverbial clauses and so on. And a new translation technique – translating into appositives – is proposed. Given the functions of attributive clauses, the characteristics of the arrangement of primary and secondary information in English and Chinese, and the features of the Chinese language, seven strategies for the translation of attributive clauses are put forward. At last, this thesis sets up the translation standard for The Obamians, picks the typical types of attributive clauses in this book for further discussion and offers the best translation techniques for each type of attributive clause, in hope of providing some reference for translation of attributive clauses of political works. ﹀
分类号：	H059/H315.9
论文总页数：	184
参考文献总数：	28
馆藏号：	017/M2013(0845)
公开日期：	2013-11-30

英汉翻译中的语序调整——以《如何为傻瓜工作》汉译为例.韩华

链接

题名：	英汉翻译中的语序调整——以《如何为傻瓜工作》汉译为例
姓名：	韩华
学号：	1001210617
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2013-11-30
关键词：	英汉翻译；语序；语用功能
论文摘要：	︿语序是语法中重要组成的部分，它是语言单位进行组织排列的基本依据。英汉两种语言的语序有相似功能，都可以用于表达思想，传递信息。它们的不同之处在于，英语的语序更注重句子的形式结构，汉语的语序则更注重汉语的逻辑顺序，由于存在这种不同，就需要在翻译工作中对语序进行调整。根据迁移理论，由于社科类文本中存在“欧化”现象，译文语序在翻译工作中产生了负迁移。语序调整的目标是消除这种“欧化”现象，使译文的语序更符合汉语读者的阅读习惯。笔者在语序的三个方面，对语序调整的方法进行了讨论和总结。本文结合社科类书籍《如何为傻瓜工作》的翻译工作，对原文和译文的语序进行了对比分析，并分类讨论了在社科类文本中语序调整的方法，希望可以提高译文的质量，改善读者的阅读体验。本文在语用功能上对原文和译文语序对比分析的结论是：对于句子的主语，在原文的倒装结构中，句子的对象不是主语，译文的主语是明确的对象；对于句子的信息焦点，原文的信息焦点能够在形式上进行体现，译文的信息焦点需要依靠汉语的逻辑顺序来表达；对于感情的表达，原文中使用倒装结构表达强烈的感情，而译文中一般使用自然语序。语序调整的方法包括：在突出信息焦点时，译文中表达总结、目的、结果等语义信息的部分需要后置；在长句中表达感情时，要将主谓宾结构补充完整，并调整为自然语序；在名词性结构修饰中心词时，译文中将其前置，并按照汉语的逻辑顺序对语序进行调整，也可以调整为短句；在表达时间的定语、状语和从句中，译文中按照时间的先后顺序安排语序；在出现多个句子时，译文要对句子之间的逻辑关系进行分析，有递进、因果等关系的句子要根据汉语的表达习惯进行语序调整。本文结合翻译工作中的译例，对语序调整的方法进行了分析和验证，说明了方法的有效性，期望在社科类文本的英汉翻译中，对语序调整有借鉴作用。﹀
分类号：	H059/H315.9
论文总页数：	191
参考文献总数：	0
馆藏号：	017/M2013(0827)
公开日期：	2013-11-30

基于语料库与文本分析的《老子》英译本比较研究.牛佳玥

链接

题名：	基于语料库与文本分析的《老子》英译本比较研究
姓名：	牛佳玥
学号：	10917344
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
论文答辩日期：	2013-11-30
关键词：	《老子》英译本；语料库；文本分析；James Legge；C. Spurgeon Medhurst；Dwight Goddard；Arthur Waley；林语堂；刘殿爵
论文摘要：	︿本文选取了《老子》平均分布的不同时代的六个英译本，译者分别是James Legge，C. Spurgeon Medhurst，Dwight Goddard，Arthur Waley，Lin Yutang和D. C. Lau，通过语料库的方法从词汇、语法、语篇三个层面统计分析六个英译本的语言特征，并用文本分析的方法讨论这些语言特征所体现出来的译者的思想。 Legge的译本整体上倾向于物质性的客观性，原因在于他认为《老子》中包含着与达尔文的思想类似的进化性的自然主义；用词丰富，长句较多，且习惯用多个词表达原文的意义，这是由于他并不能确定自己是否完全理解了原文。Medhurst从他神秘主义者的角度来进行阐释，倾向意识和存在性，有意在用词和表达上纠正Legge的风格。Goddard的第一个译本从基督教的思想进行解读，并突出了人的因素；第二个译本则是基于佛教思想的阐释，从列出的例句来看比第一个译本用词更多，但是表达上更为清晰明确。Waley认为《老子》是论证性质的文章，呼唤和劝说的意味较多，用词丰富且从历史性翻译的角度追溯词源，导致某些词的译法与其他译本有大的差异。Lin从科学和自然主义哲学的角度突出人与世界的关系，在用词和表达上更自然无为。Lau认为哲学思想不可译，因此未使用专有名词，用解释性的阐述来描述；他认为《老子》是综合了多个流派的思想编纂而成的有关治国艺术的书籍，在用词和表达上都有体现。﹀
分类号：	H315.9/H085
论文总页数：	80
参考文献总数：	0
馆藏号：	017/M2013(0907)
公开日期：	2013-11-30

科普翻译中分译与合译的全信息视角——以《人类本性的科学》翻译实践为例.沈威杰

链接

题名：	科普翻译中分译与合译的全信息视角——以《人类本性的科学》翻译实践为例
姓名：	沈威杰
学号：	1001210808
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-11-30
关键词：	科普翻译；全信息；分译；合译
论文摘要：	︿分译与合译是科普书籍英译汉翻译实践中两种常用的方法，具有化整为零和化零为整的效果。分译可以将英语中结构和内容杂糅的成分分而治之，化繁为简，使译文更具可读性和简约性；合译可以将英语中结构和内容零散的成分统筹规划，同样可以化繁为简，使译文更具整体性和连贯性。本文对分译与合译的研究包括三个部分：首先，由于当前译者对分译与合译含义的见解并不统一，因此笔者根据英语和汉语的句法单位结构序列说明了分译与合译的含义，并基于交际翻译和功能对等的翻译理论从全信息的视角分析了分译与合译在科普翻译中的作用和目标。从全信息视角看，词和句子的分合是语法信息层面的现象，而驱动分合的是深层的语义和语用信息，分译与合译的作用就是以读者为中心合理组织信息，分合的目标是能让读者更加容易地理解、接受、加工出语义和语用信息。接着，根据分译与合译的不同情况，笔者对提高译文的可读性和连贯性的具体分合策略进行了总结和探讨。最后，笔者结合教育心理学科普书籍《人类本性的科学》的翻译实践，分类讨论了分译与合译在科普翻译中的应用，以期译文能更加符合汉语读者的阅读习惯，从而提高普及科学知识的效率。笔者认为在科普英语书籍的翻译过程中：在词和词组方面，宜根据汉语表达习惯将集中的语义和语用信息分译为多个词、一个分句或者独立的句子，宜将语义信息重复或强调的词合译为对等的二字格或四字格词；在从句方面，通过分析主句和从句的逻辑关系，一般将修饰和限制关系的从句合译为句子的某个成分，比如定语、状语，将有评述关系和在主句基础上提供新信息的从句分译，并根据汉语的句法生成逻辑关系调整语序；在简单句方面，对句中的主语、宾语以及其它成分视汉语表达习惯进行分译处理，对有指代、因果等逻辑关系以及排比的简单句进行合译处理。﹀
分类号：	H059/H315.9
论文总页数：	162
参考文献总数：	0
馆藏号：	017/M2013(0836)
公开日期：	2013-11-30

2013-06-08

翻译视角下的双语词典研究与设计.方舟

链接

题名：	翻译视角下的双语词典研究与设计
姓名：	方舟
学号：	1001210585
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2013-06-08
外文题名：	A Study and Design of Bilingual Dictionary in the Perspective of Translation
关键词：	翻译工具；翻译障碍；交互式双语电子词典；词义消歧
外文关键词：	Translation tools; Translation obstacles; Interactive electronic bilingual dictionary; Word Sense Disambiguation
论文摘要：	︿翻译是翻译主体发挥主观能动性，将一种语言转换为其他语言的过程。然而，对于初级译员而言，可能会在翻译过程中遇到各种翻译障碍而不能顺畅地生成译文。即使生成译文，可能也会和符合标准的理想译文存在差距。本文想就造成译员实际译文与理想译文差距的成因，即翻译主体、客体以及工具引发的翻译障碍进行系统的梳理与分析，从成因入手，研究现有翻译工具，特别是作为翻译首要工具的双语词典在翻译应用中的不足。之后，对译员的需求进行调研，从而提出符合翻译需求的新型电子词典的设计构想与实现方法。本论文的根本目的在于，从翻译的角度，探讨改进翻译辅助工具的构想和方法，进一步加强信息技术与翻译的深度结合，利用信息技术弥补翻译主体的不足，在较短时间内，缩短实际译文与理想译文的差距，提高译文质量。本文共分为五章。第一章“提出问题”，在描述初级译员的实际译文与理想译文存在差距这一背景后，提出对双语词典的研究，包括双语词典的研究现状、研究意义、现有研究对于研发新型双语电子词典的启发、本文的框架、工作成果及创新点。第二章“分析问题”，主要对影响译文质量的成因进行梳理与分析，包括翻译主体引发的翻译障碍、翻译客体引发的翻译障碍、翻译工具引发的翻译障碍以及对于三者之间关系的分析，从而确定研发交互式双语电子词典是弥补现有研究和产品空白的首选方案。第三章“梳理用户需求”，包括对译员的翻译认知过程，使用词典的心理表征等方面的研究，并通过调查问卷的形式对译员需求做出具有针对性的调研，为新型双语电子词典的设计提供依据。第四章根据用户需求，提出交互式双语电子词典的设计要点与实现方法。第五章是结论与展望部分，总结本文成果与局限，并对未来工作进展展望。﹀
外文摘要：	︿ Translation is a process of initiatively transferring languages from one to another. However, beginners may encounter various obstacles that hinder them from translating smoothly during this process. Even if target translations are generated, gap may exist between translations by beginners and the ideal ones. In this article, the author aims at analyzing causes of the gap between translations by beginners and the ideal ones in the perspectives of the deficiencies of translation subjects (translators), translation objects(translation projects) and translation tools, especially one of the most popular tools, bilingual dictionaries. Based on these analyses, survey on translators and their requirements on translation tools are conducted in order to collect and provide evidences from users’ perspective for the design and implementation of a new type of translator-oriented interactive electronic bilingual dictionary. The purpose of this thesis is to create a new type of dictionary that can improve the current translation tools, especially one of the most frequently used tools, bilingual dictionaries, from the perspective of translation, strengthening the application of IT technology in the translation field, covering the deficiencies of translators and filling in gaps between translations by translators and the ideal ones in a short time. This article is divided into five chapters. Chapter One is to raise the necessity of researching bilingual dictionaries, one of the most frequently used translation tools, against the backdrop of gaps between translations by junior translators or translation students and the ideal translations. This chapter includes overall summary of the current researches on bilingual dictionaries, research purposes, inspiration on developing a new electronic bilingual dictionary, structure of this article, achievements related to this thesis and innovations in it. 
Chapter Two is to analyze the causes that affect the quality of translation from the perspectives of deficiencies of translation subjects, translation objects and translation tools, especially from the perspective of translation tools, based on which a thought of developing a new type of electronic bilingual dictionary is generated. Chapter Three is to research translators’ behaviors of translation and dictionary uses, and their requirements on translation tools by conducting a survey. Chapter Four is to demonstrate the design and implementation of a new type of translator-oriented interactive electronic bilingual dictionary with tests on the efficiency of this new dictionary. Chapter Five will be the conclusion of this article and the outlook of the future researches. ﹀
分类号：	H059/TP391
论文总页数：	75
参考文献总数：	61
馆藏号：	017/M2013(0106)
公开日期：	2013-06-08

技术说明书的易读性研究.杨涵舒

链接

题名：	技术说明书的易读性研究
姓名：	杨涵舒
学号：	10917510
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	技术说明书；易读性判别；易读性公式
论文摘要：	︿现代科技的发展让我们的生活变得越来越离不开各类机器、电子设备以及软件。自我学习如何使用以及如何排除故障，成为现代人的基本能力之一。技术手册（Technical Document）、技术说明书类文本（Instructional Texts）的作用正在于此，可以指导帮助人们完成某种操作、学会某种使用功能、掌握某种技巧。技术说明书的根本目的在于让读者能够理解文字所表述的操作内容。技术说明书扮演了如此重要的角色，那么研究人们如何能够判别其易读易懂的程度就是一件比较重要的事情。易读性的研究，特别是英语语言的易读性研究很早就开始了，但基本上局限于通用文本领域。本文试图扩张其研究范围，探索将传统研究模式用于英文技术说明书的易读性方面。本文首先进行了以语料库统计分析为根基的英文技术说明书的特征研究，发现了英文技术说明书类型文体在词汇层面和句子层面的特点。词汇层面从各种词性的语词分布入手，指出诸如名词中的专有名词数量，代词中的第二人称代词等等都有较高的出现频率，在句子层面则提出较少使用被动式，祈使句特别多等现象并指出了其出现的原因。在此基础上研究了可能对英文技术说明书的易读性产生影响的关键要素。本文随后以完形填空的测试形态获取了20种电子版英文技术说明书的易读性实证结果。在经历过泛泛选择研究变量导致的挫折后，以前人的易读性公式的适用性研究为突破口，增加自己发现的影响技术说明书易读性的判定变量，使用SPSS软件进行回归分析，最终得到了适用于技术说明书的易读性公式。本文的创新性除了基于语料库分析的英文技术说明书文体特点之外，在于以传统的易读性研究公式为基础，综合利用其他易读性公式的优点，在分析英文技术说明书文本特点的基础上，增加了多个变量，例如以句子相似度的计算挖掘技术说明书类文体的句法复杂性等，构建出适用于技术说明书领域的易读性公式。这一公式已运用于语言信息工程系正在研发的技术文档写作辅助系统中。﹀
分类号：	H087/H085
论文总页数：	79
参考文献总数：	49
馆藏号：	017/M2013(0603)
公开日期：	2013-06-08

虚拟翻译团队绩效问题研究.曹达钦

链接

题名：	虚拟翻译团队绩效问题研究
姓名：	曹达钦
学号：	10917087
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	虚拟翻译团队；虚拟团队绩效；绩效管理
论文摘要：	︿随着企业日益国际化，逐渐在世界各地开展业务，同时由于技术的发展和进步而导致对专业化知识需求的增强，翻译需求越来越多样化、专业化，而翻译需求的量也越来越大，翻译周期越来越短。因而，越来越多的翻译公司不再仅仅依靠内部的全职译员来完成翻译项目，而是逐渐开发、储备并利用外部的兼职译员或虚拟翻译团队来满足客户日益增长、复杂化的需求。然而，在此背景下，翻译公司对虚拟翻译团队却缺乏有效的管理方法和绩效考核。目前，翻译公司对兼职译员或虚拟翻译团队的考核普遍还仅限于翻译数量和译文质量两个维度，不考虑或者不够重视对翻译流程、团队内外部沟通过程、工作态度等方面的考核，导致在项目质量、进度方面频频出现事故，给翻译公司以及项目经理带来很大的挑战和困难，其中主要困难就在于：1）如何科学、可靠地选择虚拟翻译团队；2）如何全面地评价、考核虚拟翻译团队。本文旨在通过建立一套综合的绩效指标体系来解决第二个问题。本文基于传统的虚拟团队绩效影响因素模型，通过问卷及深度访谈的研究方法得出适用于虚拟翻译团队的绩效影响因素模型；然后在此模型的基础上，通过绩效领域常见的定性及定量方法得出适用于虚拟翻译团队的绩效指标体系，并通过具体的案例进行实证研究，证实该体系能够有效用于提高虚拟翻译团队的质量、速度及客户满意度；最后通过层次分析法，将该指标体系用于支持翻译服务采购决策，说明其在翻译服务采购过程中的作用。﹀
分类号：	F276.44/TP311.1
论文总页数：	72
参考文献总数：	19
馆藏号：	017/M2013(0573)
公开日期：	2013-06-08

昆曲翻译与英文诗歌的互文性——以李林德《牡丹亭》译本为例.卢伟

链接

题名：	昆曲翻译与英文诗歌的互文性——以李林德《牡丹亭》译本为例
姓名：	卢伟
学号：	1001210755
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2013-06-08
外文题名：	The Intertextuality of Kunqu Translation and English Poetry---A Case Study of Lindy Li Mark's translation of The Peony Pavilion
关键词：	昆曲翻译；英文诗歌；互文性；牡丹亭；李林德
外文关键词：	Kunqu Translation; English poetry; intertextuality; The Peony Pavilion; Lindy Li Mark
论文摘要：	︿昆曲是中华民族的文化瑰宝之一，随着昆曲越来越频繁地走上世界舞台，对昆曲的翻译以及翻译研究的需要也越来越迫切。从互文性角度研究昆曲翻译才刚刚起步，研究角度也还停留在源语在源语文化中的互文性研究，没有从译语在译语文化的互文性角度，以西方观众对诗歌的接受为基准审视我们自己的翻译存在的问题。笔者将从广受好评的白先勇青春版《牡丹亭》的翻译——李林德的舞台译本为例，运用文本细读、互文性批评和读者反应理论等研究方法，通过分析昆曲翻译和英文诗歌的互文性，在译文中提炼出大量英诗的互文本痕迹，得出观点：重申“译诗像诗”的诗歌翻译观，汉诗英译就要译得像英诗，符合译语观众的审美接受。本文将互文性批评应用到翻译研究中，主要是在译文中寻觅到英诗的踪迹，这些相似的笔法构成了译文和英诗的互文性，使得汉诗译得具有英诗的感觉，从而得到译语观众的好评。从互文性的角度来探究翻译，总是能在译文的字里行间，评踪辨影，找出译语文化、译语文学作品的互文性痕迹。本文将用大量的译例详细分析译者如何在翻译过程中，调动一切互文性知识储备，在源语文化和译语文化中编织互文性的大网，更多地要在译语文化中寻找潜伏着的无数的织体，进行跨文化的互文性转换，让译语观众、读者也能深入其境，进行一场跨越时空的观演和阅读体验。全文分七章，第一章介绍昆曲、互文性、李林德《牡丹亭》译本的背景知识和论文的选题意义、研究方法。第二章文献综述梳理了诗歌翻译、互文性角度翻译研究和《牡丹亭》英译的研究综述。三、四、五、六章分别从互文性角度对译文和英文诗歌的诗性词汇、音韵格律、句法和文化词汇进行详细分析。第七章余论，归纳总结论点汉诗英译须像英诗，并对进一步的文化过滤研究提出思考。﹀
外文摘要：	︿ Kunqu Opera is one of the cultural treasures of ancient China. As more and more Kunqu plays are being frequently performed on the worldwide stage, Kunqu translation and Kunqu translation studies become more and more important. The research on Kunqu translation from the intertextual perspective has just started. The point of view is still stuck in researching intertextuality between the source language and the source culture, not in researching intertextuality between the target language and the target culture or in taking a western audience’s acceptance of poetry as the benchmark by which we examine our own translation problems. The author of this thesis takes as an example the widely acclaimed stage translation of The Peony Pavilion by Lindy Li Mark for Pai Hsien-yung’s young lovers’ edition to analyze the intertextuality of Kunqu translation and English poetry. Using the research methods of Close Reading, Intertextuality Criticism and Reader Response Theory, this thesis extracts a large number of intertextual traces of English poetry in the translation. These analyses lead to a conclusion that reaffirms the old concept “translated poetry should sound like poetry” and arrives at my own point of view: the translation of Chinese poetry should sound like English poetry and consider the aesthetic acceptance of the target language audiences. This thesis applies Intertextuality Criticism to translation studies, mainly by finding the traces of English poetry in Lindy Li Mark’s translation. This similar brushwork betrays the intertextuality between the translation and English poetry, endows the translation of Chinese poetry with the poetic feeling of English poetry and thus wins much acclaim from the target language audience. To explore translation from the perspective of intertextuality, I can always find many clues and traces of intertextuality between target language culture and literary works in the lines of translation. 
This thesis uses a lot of detailed translation examples to analyze how the translator mobilizes all reserves of knowledge and weaves a large intertextual net of the source language culture and the target language culture, mainly how the translator finds the lurking myriad textures in the target language culture and conducts cross-cultural intertextuality conversion, so that the target language audience and readers can enjoy an enchanting performance and reading experience across time and space. The thesis has seven chapters in total. The first chapter introduces the background of Kunqu opera, the theory of intertextuality, Lindy Li Mark’s translation of The Peony Pavilion, and the significance of this topic and research methods. The second chapter is a literature review, including three parts: literature reviews on poetry translation, on translation from the perspective of intertextuality and on the translation of The Peony Pavilion. The next four chapters fully analyze the intertextuality of the Kunqu translation and English poetry in the poetic vocabulary, rhymes and metrical patterns, syntactical level and cultural vocabulary. Chapter Seven concludes that the translation of Chinese poetry should sound like English poetry and mentions the phenomenon of cultural filtration in Pai Hsien-yung’s young lovers’ edition. ﹀
分类号：	H059/I236.53
论文总页数：	65
参考文献总数：	107
馆藏号：	017/M2013(0227)
公开日期：	2013-06-08

殖民游记作品中殖民话语的翻译策略——以Tales of Travel All Around the World一书的翻译为例.邵晶晶


题名：	殖民游记作品中殖民话语的翻译策略——以Tales of Travel All Around the World一书的翻译为例
姓名：	邵晶晶
学号：	1001210804
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2013-06-08
外文题名：	Translation Strategies of Colonial Discourse in Colonial Travel Literature —— a Case Study of Tales of Travel All Around the World
关键词：	殖民游记；殖民话语；后殖民翻译研究；翻译策略
外文关键词：	Colonial Travel Literature; Colonial Discourse; Postcolonial Translation Studies; Translation Strategies
论文摘要：	︿很多殖民游记作品有着较高的文学价值和历史价值，在国内也拥有很多读者，值得翻译为中文。然而在翻译过程中，此类传记强烈的殖民色彩，体现强势文化对弱势文化的霸权心理，具有很强的侵略性。本文作者认为聪慧的读者应能对其进行批判式阅读，包括对其进行抵抗与消解。但翻译的原则应以尊重原文为前提，真实呈现文本本来面目，不多译也不少译，不美化也不丑化。但这不等于说译者因此完全放弃自己对文本的解读，他/她可通过译注的方式表达个人见解，从而帮助读者对原文形成客观的批评认识。本文以Tales of Travel All Around the World一书的翻译为例，说明如何在翻译中使用以上翻译策略，并对该原则进行了解释和论证。﹀
外文摘要：	︿ Colonial travel literature, with considerable value in literature and history and a wide readership in China, is worth translating into Chinese. However, its strong colonial discourse characterized by aggression and hegemony of the colonizing to the colonized poses a challenge to the discerning translator. The translator of Tales of Travel All Around the World as well as author of this present report believes that readers, clever as they are, are able to read critically, including resisting and deconstructing the colonial discourse if need be. Therefore, translators should respect the source text and reveal the true content of it without amplification or omission, beautification or vilification while preserving his/her opinions by annotating the target text so as to help readers to read objectively and critically. Taking the translation of Tales of Travel All Around the World as an example, this report demonstrates the above principle. ﹀
分类号：	H059/K207.8
论文总页数：	196
参考文献总数：	16
馆藏号：	017/M2013(0263)
公开日期：	2013-06-08

基于语料库的技术文档模糊限制语使用研究.何京燕


题名：	基于语料库的技术文档模糊限制语使用研究
姓名：	何京燕
学号：	10917183
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	语料库；技术文档；模糊限制语；传播目的；语义；语用
论文摘要：	︿本研究主要通过基于语料库的研究方法，揭示出技术文档中模糊限制语的语法分布特征和语用功能特点。研究中模糊限制语的定义范畴为限制语义的模糊程度，影响或修正命题内容真值程度，或表现不同命题态度的词、短语或相应的表达。本研究中构建了技术文档语料库和科普语篇语料库两个语料库，前者是主要研究对象，后者则是对比语料。本文根据模糊限制语的现有研究，进行分析和统计，构建了一个模糊限制语语法表，其中将模糊限制语分为：情态动词、动词、名词、形容词、副词、词组和从句七大类，各大类又划分为几个子类。研究方法主要是基于语料库的定量和定性分析。对技术文档中模糊限制语的语法特征进行了定量和定性分析，得到了各类中单词、词组和表达的使用频次，并计算其使用频率百分比。同时，通过 t 值检验了两个语料库的样本差异；采用卡方检验揭示了各类模糊限制语之间的差异显著性。对技术文档中模糊限制语的语用特征进行了定量和定性分析，揭示了模糊限制语如何帮助技术文档实现其传播目的。本文中将技术文档的传播目的归结为四种：帮助用户有效使用产品、规避企业信息传播风险、宣传产品以及维护企业和用户之间的关系。对于语用功能的分析，则采用例句“实证性”分析和说明，从例证中归结整理出技术文档中模糊限制语用策略的特征。研究发现，1）技术文档与科普语篇虽同属于科技文本，因传播目的的不同，在模糊限制语的使用上存在明显的差异；2）技术文档中模糊限制语的用词和表达上均较科普语篇更为简单和单一；3）技术文档中常使用情态动词类和从句类模糊限制语；4）技术文档在模糊限制语的使用上，体现出文体更强的客观性和“单声”所显示的“权威性”。同时，为达成不同的传播目的，文档作者会采用不同的模糊限制策略。模糊限制语的使用，有效地帮助了技术文档实现其传播目的。﹀
分类号：	H087/TP311
论文总页数：	89
参考文献总数：	70
馆藏号：	017/M2013(0581)
公开日期：	2013-06-08

中外政府网站招商引资文本对比研究.范平


题名：	中外政府网站招商引资文本对比研究
姓名：	范平
学号：	10917146
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2013-06-08
外文题名：	A Contrastive Study of Investment-inviting Texts on Chinese and Foreign Government Websites
关键词：	招商引资文本；语料库；语体分析；母语思维；负迁移
论文摘要：	︿招商引资工作的目的是为了吸引国内外投资，以带动地区经济的发展。因为其关系到国计民生的发展，这是很多国家政府工作中的重要环节，与之相关联的招商引资文本在政府网站上也就很常见了。对比国外，我国政府网站中的英文招商引资文本质量参差不齐，存在较多缺陷，例如，信息不够完整，语言质量不高，中式语言严重，翻译味重等等。这样的文本可读性低，很难引起投资者的投资兴趣，致使对外招商引资宣传的力度大打折扣，甚至带来负面影响。因为在中国政府网站上的英文文本受众只可能是外国人，让我们的文本趋向于外国读者的“口味”，这是该类文本写作的必然选择。有鉴于此，本文选择了对比分析国内外同类型文本这一研究路径作为研究出发点。其次，本文试图从母语思维对翻译写作的影响这一角度，对招商引资文本的写作和翻译问题进行研究。本文首先收集、自建中外政府招商引资文本语料库，通过语料库软件，对二者进行统计分析，总结得出中外政府招商引资文本的差异。本研究发现，国内的文本较为正式，语言召唤性弱；而国外的此类文本与之相比，呈现出正式性较弱，而语言召唤性强的特点。在母语思维的研究方面，本文采用语料错误标注的方法，对我国政府英文网站上的招商引资文本做出先导性分析，发现了母语负迁移现象在此类文本中较为突出层面，于是本文参考其他学者的相关工作，结合中西思维的差异，深入分析和讨论这些现象。本研究认为，我国政府英文网站上的招商引资文本，重“宣传”轻“传播”，将语言受众预设为“谨慎的投资者”，翻译味道浓重。文本传达信息的功能多于感染召唤的功能，虽然能满足信息传播的要求，但是未必符合国外受众的阅读习惯，可读性差，不一定能实现招商引资的目的。国外的招商引资文本，将语言受众预设为“投资合作伙伴”。语言亲切，多用第一、二人称，语言互动性强，可读性高。在满足信息传播功能的同时，能较好地实现招商引资的目的。最后，在研究展望中，本研究针对二者的差异，提出我国政府招商引资文本的生成，应当采用“改写”而非“翻译”的策略。建议在保持“中国特色”的原则下，对中文原文适当改写，使之能够吸引国外读者。﹀
分类号：	H087/TP393.092
论文总页数：	73
参考文献总数：	41
馆藏号：	017/M2013(0577)
公开日期：	2013-06-08

《加菲猫》漫画的翻译研究.李蕾


题名：	《加菲猫》漫画的翻译研究
姓名：	李蕾
学号：	10817203
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	漫画特点；翻译策略；加菲猫
论文摘要：	︿很多中国本土译者在翻译漫画的过程中，单纯按照出版和传播的本能进行翻译，缺乏理论和方法论的指引，翻译质量参差不齐。本论文以《加菲猫》漫画的翻译为研究对象，研究漫画的结构特点、图文关系如何影响翻译，探讨漫画出版对漫画翻译过程的影响，提高译者对漫画的理解，分析总结漫画翻译的特点，提出相应的翻译建议策略。加菲猫漫画翻译的主要目的就是要保持漫画的娱乐性功能，这一点是漫画翻译不同于其它领域翻译的最主要特征，也是奈达的功能对等理论能够在高层次上指导漫画翻译的原因所在。由于中国传统翻译理论大都是针对散文、诗歌、小说等传统文学翻译提出，然而这与漫画翻译有较大差距，使得这些理论难以直接应用于漫画翻译。本文提出，在漫画翻译过程中，不仅要以对等翻译理论为指导，更要结合漫画翻译独有的特点，不能忽略掉漫画艺术和出版技术所带来的特殊性。论文讨论了漫画翻译涉及到的各个方面的独特之处，并以《加菲猫》漫画的翻译为例进行分析。笔者认为，可以把漫画翻译过程看作是本地化过程，并提出，在翻译前和出版商和译者共同决定使用异化策略或归化策略的尺度，不管文字还是视觉内容都要根据目标文化的要求进行调整。译者不仅要意识到漫画翻译有其独特之处，更要理解漫画。漫画翻译有其自身的特点，既不同于传统的文学翻译，也不同于普通的应用文档翻译，因为漫画的文字与绘画紧密相关，和日常生活紧密相关，也和文化、风俗紧密相关。译者通过系统研究漫画中文字与图画各自的特点以及二者是怎样相互影响的，从而更好地理解漫画。只有深刻理解了漫画，才能做好漫画翻译，翻译质量也才有保障。﹀
分类号：	J218.2/H059
论文总页数：	76
参考文献总数：	0
馆藏号：	017/M2013(0570)
公开日期：	2013-06-08

目的论视角下《盗墓笔记》英译研究.汪林


题名：	目的论视角下《盗墓笔记》英译研究
姓名：	汪林
学号：	10917416
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2013-06-08
外文题名：	On the English Translation of Cavern of the Blood Zombies from the Perspective of Skopos Theory
关键词：	目的论；《盗墓笔记》；目的法则；连贯法则；忠实法则
外文关键词：	Skopos Theory; Cavern of the Blood Zombies; Skopos Rule; Coherence Rule; Fidelity Rule
论文摘要：	︿ 2011年，《盗墓笔记》第一卷的英译本在全球推广发行，尤其是在美国深受年轻读者的欢迎，也标志着中国的网络文学逐步走向世界。网络文学的兴起，为盗墓文学的出现提供平台，盗墓文学具有悬念性、传奇性、惊悚性的特点，使其迅速吸引了大批读者的关注，并随之成为了一种流行文化，随着在网络这个绝佳的传播媒体上迅速推广，中国式的盗墓文学必将走向世界，而对于盗墓文学的翻译也必将成为关注的焦点。在翻译盗墓文学的过程中，采用何种翻译方法和策略，才能恰当地将小说承载的独特的文化准确地传达给国外的读者，这是一个难点。本文将会采用描述性翻译研究方法和原文译文对比分析研究方法，对《盗墓笔记》英文版的描述性翻译研究，通过中文版和英文版小说的对比研究，系统地探索了Kathy Mok在翻译过程中采用的翻译方法。作者将会首先对《盗墓笔记》的翻译纲要进行分析，并根据目的论的三大法则分别对《盗墓笔记》英译本中的翻译方法和翻译策略进行分析。在目的法则下，为了分析该英译本是否实现了传播中国文化的翻译目的，作者从读者的角度对《盗墓笔记》英译版本中文化负载词的翻译进行分析，并总结作者在实现传播翻译目的时采用的翻译方法和翻译策略；在连贯法则下，作者利用语料库语言学对比了《盗墓笔记》中英文版词汇衔接情况以及语序转换情况。最后，在忠实法则下，对《盗墓笔记》英文版本中改写部分进行分析，总结改写的内容特点，并进而探索其改写产生的原因。本文的创新之处主要有两点：1. 在目的论的基础中，灵活应用其他研究方法，丰富目的论的研究范围；2. 在过去目的论研究罗列例子的分析方法基础上，将语料库语言学的研究方法引入其中，通过具体的数据展现《盗墓笔记》英译本的情况。本文通过对《盗墓笔记》翻译策略和翻译方法的研究，得到以下结论：在文化方面，译者保留了原文中大部分的文化信息，采用直译、意译、直译加解释、替换等多种翻译方法进行翻译，较好地实现传播文化的翻译目的；在语言方面译者充分考虑到汉语和英语的不同，通过逻辑连接词和语序的调整创造出符合英语语言习惯的作品，使西方读者更好的理解和接受；另外，为了实现原文的娱乐功能，译者保留了历史与探险的情节，对一些次要情节进行了改写。目的论的评价标准提到：只要满足目的论的三个原则，那么译本就是充分的。从上述的研究可以看出，《盗墓笔记》的英译版本是符合目的论的三个原则的，虽然存在一些问题，但是瑕不掩瑜，《盗墓笔记》英译版是一部成功的译本，其使用的翻译策略和翻译方法可为日后的盗墓文学翻译提供参考。﹀
外文摘要：	︿ The English version of Cavern of the Blood Zombies was published worldwide in 2011. It soon became highly popular among American readers, a sign that Chinese online literature is stepping onto the world stage. The development of the internet has given online literature a platform, and among these works grave-robbery fiction, with its suspense, legend and horror, has quickly attracted a large readership. With the help of the internet, grave-robbery literature will draw attention around the world, and so will its translation. In translating grave-robbery literature, it is difficult to choose translation methods and strategies that properly convey its unique cultural information to foreign readers. Taking a descriptive approach and a contrastive analysis of the English version of Cavern of the Blood Zombies, this thesis systematically studies the translation methods of Kathy Mok. First, the author analyzes the translation brief of Cavern of the Blood Zombies. Second, following the three rules of Skopos Theory, the author investigates the translation methods and strategies Kathy Mok used in Cavern of the Blood Zombies. Under the Skopos rule, the author examines the translation of culture-loaded words to judge whether the English version achieves its translation purpose of conveying Chinese culture; under the coherence rule, with the help of corpus methods, the author carries out a contrastive analysis of lexical cohesion in the Chinese and English versions. Finally, under the fidelity rule, the author analyzes the rewritten parts, summarizes their features and investigates the reasons for the rewriting. The innovations of this thesis are: 1. Skopos Theory is combined with other methods, which broadens the scope of the study; 2. Corpus methods are introduced, which present the translation through detailed data. 
From the cultural perspective, the translator kept most of the cultural information in the original and used several methods, such as literal translation, free translation, literal translation with explanation and replacement, which achieved the translation purpose of conveying Chinese culture well; as for language, considering the differences between Chinese and English, the translator adjusted the cohesion and word order so that foreign readers can better understand the story; in addition, to preserve the entertainment function, the translator kept the historical and adventure parts and rewrote some minor plot lines. According to the evaluation criterion of Skopos Theory, a translation that meets the three rules of Skopos Theory is a good translation. From the above, the English version of Cavern of the Blood Zombies meets the three rules of Skopos Theory, so it is a successful translation that can provide a reference for future translations of grave-robbery literature. ﹀
分类号：	H315.9/H059
论文总页数：	66
参考文献总数：	58
馆藏号：	017/M2013(0596)
公开日期：	2013-06-08

帮助学习者英语表达的英汉电子学习词典的设计与实现.赵梦初


题名：	帮助学习者英语表达的英汉电子学习词典的设计与实现
姓名：	赵梦初
学号：	1001211021
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	英语表达；英汉电子词典；产出模式；词汇语义网络
论文摘要：	︿当前社会各方面快速发展，对于英语交流需求也愈加迫切，然而由于中国传统应试教育仍然以阅读为核心，中国英语学习者的产出性词汇缺乏，因此英语表达过程经常会遇到不同的困难：一是对于想要表达的意义只能想到一个母语概念，无法想到与之相对应的英语表达；二是能想到表达意义的基本英语词汇或表达，然而缺乏更准确或更生动的其他词汇或表达；三是即使判断使用了正确的词汇选项，由于对词汇知识了解不够全面，仍然无法进行最终地道或正确的表达。而词典作为帮助学习者的重要工具，需要摆脱传统功能的束缚，从解码型词典过渡到编码型词典。为了解决上面的三个问题，本文探讨了不同学者对于产出性词典模型的理论说明和实验证明，认为帮助学习者英语表达的英汉电子学习词典必须包括两种产出方式，即从源语搜索开始和从目的语搜索开始，对应汉英模式和英汉模式。虽然当前市场已经有同时具备这两种模式的电子词典，如金山词霸、有道词典等，然而这类词典的并不能完全满足学习者的需求，而且缺乏对于英语表达需求的针对性。因此本文分别从这两个模式分析：汉英模式以前人的文献和汉英词典为基础，通过具体翻译实例的分析，提出新汉英模式的原则；英汉模式以解决词汇扩展和词汇使用为目标，试图构建新的词汇语义网络模型来指导词典编纂，通过与五大单语学习词典进行比较，总结出英汉模式的词典模型。然后结合分析，分别对新的模型进行词条举例说明，完成新词典的原型设计，最后用案例调查分析新词典内容的产出性。前人的研究一般都只单独处理一个问题即汉英模式或英汉模式，缺乏研究的整体性；以往词汇语义网络的研究侧重二语习得的角度，无法直接应用于帮助英语表达的词典模型；而且从未将对学习者有重要影响的词块引入词汇语义网络中帮助词典编纂。本文正是从这些出发，探讨了英汉电子学习词典的产出内容和构建方法，提出了具体的编纂建议和实例。﹀
分类号：	TP311.1/TP18
论文总页数：	103
参考文献总数：	56
馆藏号：	017/M2013(0431)
公开日期：	2013-06-08

不同媒体下的语言特点及对应翻译策略的研究——以《魔戒》为例.张昀


题名：	不同媒体下的语言特点及对应翻译策略的研究——以《魔戒》为例
姓名：	张昀
学号：	1001211011
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	媒体；传播；语言；翻译策略
论文摘要：	︿随着媒体技术的发展，文化的呈现方式和传播方式都受到了影响。这给翻译，特别是多媒体的文本翻译带来了前所未有的挑战。《魔戒》存在多种媒体形式，如小说、电影和网络游戏，本论文就以《魔戒》三种媒体中的文本为对象，从语言特点与传播特点两个角度来研究不同媒体的翻译特点。在语言特点方面，分别从词汇、句子、语篇三个层面进行了统计学分析。在词汇层面，统计和分析了类符数、形符数、标准化类符/形符比、词汇密度、词长分布。在句子层面，统计和分析了平均句长。在语篇层面，统计和分析了连接词的使用情况。最后得出这样的结论：原著小说用语最为灵活，表达的感情也最丰富；游戏在用语方面更为严谨；而电影更倾向于采用短句子。因此，小说翻译往往比电影和游戏的文本翻译更为频繁地使用较长的句子和逻辑连词。在传播特点方面，论文主要以《魔戒》的三种承载媒体（小说、电影和网络游戏）为研究对象，从受众约束和媒体约束这两个方面解释了不同传播媒体对翻译的影响。受众约束主要包括受众接受，受众层次和审美期待；而媒体约束则分为文本的约束，空间，时间的限制和交互约束。这些不同方面的约束造成了不同媒体间的翻译差异。通过研究，笔者得出结论：传统的语言功能理论已经不能完全满足小说、电影、游戏等多种媒体对翻译的要求，因此媒体的翻译过程有必要考虑媒体的传播特点与审美标准。﹀
分类号：	H059/H315.9
论文总页数：	75
参考文献总数：	62
馆藏号：	017/M2013(0422)
公开日期：	2013-06-08

基于客户投诉的酒店职业英语培训设计.王芳


题名：	基于客户投诉的酒店职业英语培训设计
姓名：	王芳
学号：	10917425
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	专门用途英语；酒店职业英语；在职培训；培训课程设计
论文摘要：	︿随着我国综合国力的提高，来我国入境游的外国游客日益增多，我国酒店行业迎来了大好的发展契机。然而酒店服务人员的英语口语能力却并不令人满意。课程设计人员如何设计酒店英语口语培训以便提高酒店员工的英语口语能力，从而提升酒店的服务质量，使酒店获得更大的发展，是当前很多酒店培训部门急需解决的课题。酒店英语教学研究已经在我国高校广泛开展，但是针对酒店在职员工的英语口语培训的研究并不普遍。不少酒店在对员工进行培训时，照搬大中专院校的酒店英语课程设置，以至于培训的效果甚微，影响了酒店的进一步发展。本文首先简要介绍了研究背景、研究问题、研究过程中所使用的研究方法以及研究思路。其次笔者对专门用途英语课程设计理论进行了简要回顾，并在前人工作的基础上，重点分析了酒店行业多数英语培训项目在需求分析、教学大纲设计、教材选用、教学方法运用和语言测试设计五个方面所存在的问题。本文以北京酒店行业若干基层员工为研究对象，通过对酒店外宾进行访谈、搜集外宾的投诉文本以及对员工学习需求进行问卷调查，提出了基于客户投诉的酒店英语培训设计，并设计出一套完整的培训课程体系。同时，本文将理论付诸于实践，设置实验班和对照班，开展教学实验，并对实验结果进行了具体分析。本研究是将专门用途英语课程设计理论与酒店英语培训实践相结合的有益尝试，同时本研究对于酒店培训部门设计英语口语培训课程具有一定的借鉴意义。﹀
外文摘要：	︿ With the growth of China’s comprehensive national power, an increasing number of foreign tourists have been visiting China, bringing great opportunities to hotels in China. However, the spoken English of many hotel staff is not up to their hotels’ standards. It is an urgent issue for hotels to improve their staff’s English speaking skills, so as to improve service quality and achieve more success. Many studies on teaching English for hotel service have been carried out among students in colleges and universities. However, training in oral English skills for hotel staff has not been researched as much. Many hotels’ training plans are copied from college curricula, which makes the training ineffective. This study first briefly introduces the research background, research question, methods and research process. Then with a review of theories on English for specific purposes, this study analyzes the problems existing in current training courses. Taking hotel staff in Beijing as the research object, this study proposes a training model oriented towards serving foreign guests. A tailored course is then designed by making use of the results from interviewing foreign guests, collecting complaints from them and conducting a questionnaire survey among hotel staff. The results of teaching experiments conducted among two classes are fully analyzed. Being a useful attempt to apply theories of English for specific purposes to hotel English training, this study suggests a new way of designing English training courses and enriches the research on hotel in-service training. ﹀
分类号：	H319.3
论文总页数：	82
参考文献总数：	49
馆藏号：	017/M2013(0597)
公开日期：	2013-06-08

“熟词生义”现象及面向中国英语学习者的在线词典改进设计.瞿乔


题名：	“熟词生义”现象及面向中国英语学习者的在线词典改进设计
姓名：	瞿乔
学号：	1001210666
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	熟词生义；词汇深度知识学习；中国英语学习者；在线词典改进
论文摘要：	︿经济全球化背景下，国际分工日益加深，国家间合作不断扩大，中国也因此进入了对外开放的崭新时期。新形势下，提高国民的外语素质显得尤为重要。词汇和语法是语言教学中的主要内容。语法作为有穷的规则体系，一定学习投入后，可被较好掌握；而在“背单词”方法下，学习者的词汇量即使有了较快增长，实际运用能力却并不理想。 “熟词生义”是中国英语学习者词汇学习中的常见现象。文献分析后发现，该现象涉及学习者的词汇深度知识发展，是中国学习者普遍存在的词汇掌握问题。借助笔译活动，本文归纳和分析了该现象在语言解码、编码过程中的六种主要类型：解码过程中主要有“词性转换类生义”、“词义引申类生义”、“固定组合类生义”和“文化差异类生义”四个类型；编码过程中主要有“不善运用多义熟词”和“不善辨析近义熟词”两个类型。借助各类型的具体实例，本文进一步考察了现有新媒介词典（以光盘词典和在线词典为代表）在解决相应词汇深度知识问题时的缺陷和不足。从而，由实际问题出发，归纳了现有词典的局限性。在充分考虑在线词典现有内容及呈现特点的基础上，本文提出增设在线词典的高级检索功能，并实现了该功能所需“结构化义项关键信息”的属性设计。通过高级检索功能，学习者可以实现单词义项及关键搭配信息的直接查询，还能对一本英汉词典中的义项进行汉英查询。结合其它局限，本文还对在线词典的词条浏览页面、例证利用模式提出了改进需求。上述改进设计，既是对课堂教学的延伸和补充，也是网络环境下词典、语料库等语言资源的整合与创新利用。该方案的实现，有助于学习者在词典使用过程中补充相关词汇深度知识；有助于学习者方便快捷地查询单词搭配信息并辨析近义义项；有利于增加学习者的语言输出机会，提高语言运用能力。﹀
分类号：	TP391/TP314
论文总页数：	70
参考文献总数：	38
馆藏号：	017/M2013(0158)
公开日期：	2013-06-08

基于阅读策略的英语专业阅读课程教学设计.尤春丽


题名：	基于阅读策略的英语专业阅读课程教学设计
姓名：	尤春丽
学号：	10917527
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	阅读策略；阅读教学；教学设计
论文摘要：	︿ 21世纪是一个竞争激烈的国际化时代。新的时代对我国大学生的英语能力也有了更高的要求。对于把英语作为第二语言的学生来说，英语阅读是提高英语能力的关键。阅读是学生掌握语言知识、打好语言基础、获取信息的重要渠道。而对于英语专业学生来说，阅读不只是为了通过专业四级和八级考试，顺利拿到证书，保证就业，更重要的是，阅读可以培养他们的人文综合素养。英语阅读对英语专业学生的重要性决定了加强英语专业阅读教学的必要性。本研究的目的在于，通过对专业四级阅读题目类型和使用的阅读策略的分析，将基于阅读策略的训练嵌入到英语专业阅读课程中，以提高学生专业四级阅读的能力。本文的研究方法包括文献研究、问卷调查、教师访谈和阅读任务设计。文献研究部分主要总结了国内外第二语言学习策略和阅读策略、语言课程设计和专业四级的研究成果。调查问卷用以了解学生对待阅读学习的态度和阅读策略掌握的情况。教师访谈主要是通过对青岛理工大学琴岛学院英语专业阅读教师授课现状的了解，掌握本研究设计的重点。阅读任务设计部分指的是，笔者遵循专业四级阅读题目类型的规律和专家研究这些题目类型中经常使用的阅读策略，为《现代大学英语精读2》（第一版）第一单元进行了基于阅读策略的教学设计。通过一段时间的小范围试用后，笔者使用2011年专业四级考试阅读测试考察实验效果。最后，本文进行经验总结，分析出研究存在的不足，并对未来研究进行展望。本研究还存在不足之处：有限的策略训练的时间和实验对象的数量使实验结果不能全面说明阅读策略训练的有效性。今后应深化设计的内容，延长策略训练的时间并选取更多的学生进行深入的策略训练的效果研究。﹀
分类号：	H615.9/H059
论文总页数：	64
参考文献总数：	31
馆藏号：	017/M2013(0604)
公开日期：	2013-06-08

翻译团队激励机制对提升翻译项目质量的研究.潘婧



游戏翻译研究——以掌机游戏为例.张寅

链接

题名：	游戏翻译研究——以掌机游戏为例
姓名：	张寅
学号：	1001211009
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	翻译研究；掌机游戏；双重主体；混合文化；译者自由
论文摘要：	︿随着次世代主机竞争越发激烈，行货中文版软件日益增多，怎样才能为玩家奉上高品质中文化游戏软件为玩家提供良好的游戏体验成为新时期国内游戏软件行业需要关心的一点。而在这种诉求下，本文从文本的角度入手，以掌机游戏的翻译素材为基础，对游戏翻译进行探索研究。本文提出以下几个理论假设：游戏翻译具有混合文化性，双重主体性和译者自由性。而在这三个特性各自或混合的影响下，游戏翻译呈现出与其他领域翻译不同的翻译过程翻译策略翻译方法等。本文在提出假设后先从理论层面，结合前人对于翻译中文化的影响因素，翻译主体性研究，翻译策略研究等理论基础和研究方法对游戏翻译进行探索和分析；之后列举具体译例，从词汇和句段级别分别对之前的假设与理论分析提供实证以验证理论假设的正确性。全文共分五个主要章节，第一章为绪论，介绍研究的范围和研究背景；第二章为文献综述，对前人的工作进行评述和总结；第三部分是理论分析，对提出的假设进行理论层面的分析；第四章是翻译实例，通过实例验证理论分析的正确性；最后一章是总结与展望。﹀
外文摘要：	︿ As competition among next-generation game devices intensifies and the quantity of licensed game software in Chinese grows, how to provide high-quality Chinese localized game software that gives players a better game experience has become a growing concern in the domestic game software industry. To meet this need, this paper starts from the text and, based on translation materials from handheld game devices, conducts an exploratory study of game translation. The paper puts forward the following theoretical assumptions: game translation is characterized by mixed culture, double translator subjects and translator latitude. Under the influence of these three features, separately or together, game translation differs from translation in other areas in its translation processes, methods and strategies. After stating the assumptions, the paper first explores and analyzes game translation at the theoretical level, drawing on previous scholars’ results on cultural influence in translation, translation subjectivity and translation strategy. It then presents specific translation examples at the word level and the sentence level to support the preceding theoretical assumptions. There are five main chapters in this paper. Chapter 1 is the introduction, which presents the scope and background of the research. Chapter 2 is the literature review, which comments on and summarizes previous scholars’ research. Chapter 3 is the theoretical analysis of the assumptions. Chapter 4 presents translation examples that support the preceding assumptions. The last chapter is the conclusion. ﹀
分类号：	H059/TP311.52
论文总页数：	62
参考文献总数：	73
馆藏号：	017/M2013(0420)
公开日期：	2013-06-08

中美英军事新闻英语语体特征对比研究.许文锋


题名：	中美英军事新闻英语语体特征对比研究
姓名：	许文锋
学号：	1001210508
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-08
外文题名：	The comparative study of linguostylistic features of English military news in China, the US and the UK
关键词：	军事新闻；新闻英语；语体；语体特征
外文关键词：	Military News; Journalistic English; Register; Linguostylistic Feature
论文摘要：	︿本研究以语体理论为指导，从中美英军事新闻网站上选取约22万字的新闻语料，采用定量分析和定性分析相结合的方法，揭示了中美英军事新闻英语在词汇语法、语篇和修辞方面的异同，并结合对外传播理论给出了相应的创作策略，以期能够帮助改进我国英文军事新闻的创作工作，使其取得更好的对外传播效果。在词汇语法层，本研究运用了Douglas Biber教授提出的多维度分析理论，借助单因素方差分析，确定了33个语言特征进行因子分析，发现中美英军事新闻英语在五个维度上存在差异。研究表明，中国军事新闻英语在信息性和概括性方面较强，美国和英国军事新闻英语在非正式性、互动性和时效性方面较强；美国军事新闻英语在阐述性方面最强，中国居中，英国最弱。从传播学角度看，美英军事新闻在新闻的亲近性方面较强，容易吸引受众；中国英文军事新闻更偏重提供信息，多用概括性、抽象性语言，因此新闻的亲近性较弱，不利于吸引受众。在语篇层，本研究运用了系统功能语法中的语言元功能理论，从及物性系统、情态成分系统和主位结构方面分析了中美英军事新闻英语语体特征的异同。中美英军事新闻都大量运用物质过程，凸显了语篇的叙述性。美英军事新闻语篇善用新闻背景，常常采用平民化叙事视角；而中国语篇多罗列新闻背景，采用官方视角。中美军事新闻大量使用言语过程，以增加权威性，英国语篇却较少使用此过程。中国军事新闻英文语篇较少使用情态成分，情态成分以情态动词为主，直白地表达了作者的观点；美英语篇却较多地使用情态成分，而且相对于中国语篇，使用较多的情态附接语和谓词的扩展形式，隐蔽地表达了作者的观点。中美军事新闻语篇较多地使用标记性主位，透露出作者的主观态度，而英国语篇很少使用标记性主位。在修辞方面，本研究选取了中美英军事新闻各20篇，统计了比喻、典故、双关、反语和委婉语在这些语篇中的使用情况。美英语篇较多地使用这些修辞手法，有效地增强了语篇的文采性。中国军事新闻英文语篇很少使用这些修辞格，有时因为文化冲突造成错译，给新闻语篇带来了负面效果。﹀
外文摘要：	︿ Guided by the linguistic theory of register, this thesis selects English military news of about 220,000 words, adopts the approach of quantitative analysis and qualitative analysis combined and whereby reveals the differences of English military news in China, the US and the UK at the levels of lexis and grammar, discourse and figures of speech. Then appropriate news production strategies are provided from the perspective of international communication in order to help our country produce better English military news with greater effects of international communication. As for the lexical and grammatical differences, this research follows the multi-dimensional analysis model by Douglas Biber, chooses 33 linguistic features to be applied in the factor analysis with the help of One-Way ANOVA and then concludes that the English military news in the three countries show differences on five dimensions. The English military news in China features an orientation of informational production and generality whereas the news in the US and the UK informality, interactivity and timeliness. On the scale of expressiveness, the news of the US occupies a top position, that of China a middle position and the UK a low position. From the perspective of the international communication, the English military news of the US and the UK has a high value of proximity so it is more likely to attract readers. However, the English military news of China tends to provide information and adopt general and abstract language, which makes it less proximate to readers. As for the differences at the level of discourse, this paper applies the theory of language metafunctions in systemic functional grammar to the discussion of the linguostylistic features of the English military news in terms of transitivity, modality and thematic structure. The news in the three countries all use material process clauses extensively, which reflects that narration is the basic function of news reports. 
But during the narration the American and British news discourses always value news background and tell stories of individuals and ordinary people while the Chinese side merely lists background information and uses condescending narratives. Both the American news discourses and Chinese ones use a high proportion of verbal process clauses to sound more authoritative and reliable but this process is much less used in the news of UK. Chinese military news discourses use a small amount of modality elements of which a major part is constituted by finite modal operators which bluntly reveal authors’ attitudes. Nevertheless, American and British news discourses use more modality elements of which modal adjuncts and expansions of the predicator occupies a larger proportion compared with that in Chinese discourses. This hides authors’ attitudes. The military news discourses in the three countries use unmarked theme structures extensively but the British discourses use much fewer marked theme structures than Chinese and American ones, so the discourses in these two countries more explicitly show authors’ attitudes and emotions. As far as the use of the figures of speech is concerned, this paper randomly selects 20 news discourses from Chinese, American and British English military news corpora respectively and then conducts a statistical analysis of simile and metaphor, allusion, pun, irony and euphemism. The American and British news discourses are more likely to use these figures of speech, which adds to the literary beauty and profoundness of the news discourses while the Chinese news discourses seldom use these devices. Sometimes these figures of speech in the English military news discourses of China suffer inappropriate translations for neglect and misunderstanding of their cultural implications and thereby bring negative effects on the discourses. ﹀
分类号：	H087/TP311
论文总页数：	84
参考文献总数：	81
馆藏号：	017/M2013(0052)
公开日期：	2013-06-08

人物专访的译注问题——以 100 New Yorkers of the 1970s 为例.曹广慧


题名：	人物专访的译注问题——以 100 New Yorkers of the 1970s 为例
姓名：	曹广慧
学号：	1001210518
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2013-06-08
外文题名：	Annotating Profiles in Translation: A Case Study of 100 New Yorkers of the 1970s
关键词：	人物专访；翻译；译注；工具
外文关键词：	Profile; Translation; Annotation; Tools
论文摘要：	︿《百位1970年代纽约人》出版于2005年，是由作者麦克斯·米勒德为纽约本地报纸《电视信息报》所写的210篇人物专访中精选出一百篇结册，采访对象多为艺术界和娱乐界名人。人物专访往往具有一定的时效性，涉及诸多时代背景要素，会提及各种人物、地点、组织、事件等等。因此为使读者理解语境，在翻译时对这类词语应稍作注解。一则为读者提供相关的信息，二则对译文起补偿作用，为中文读者还原英语读者在原语言环境和时代背景下接收到的预期效果。本文主要讨论针对名称的译注以及笔者个人在翻译中实际使用的网络资料搜索方法。为还原语境，对所有中文读者不甚熟悉的名字都增加了简介、与本书的内容有所关联的信息、以及可能使中国读者建立起直观印象的事迹等等。﹀
外文摘要：	︿ 100 New Yorkers of the 1970s by Max Millard (2005) is a collection of 100 profiles selected from a total of 210 interviews, mostly with people from the arts and entertainment industries, that Millard conducted for the local newspaper TV Shopper in New York in the late 1970s. When translated, these profiles, with their abundant names of people, locations, organizations and events peculiar to the time, present a difficulty for the target reader. They have to be annotated in order to facilitate understanding and to restore a historical and cultural context that would be familiar to American readers but would, more often than not, elude Chinese readers. The present report mainly discusses the annotation of names and the translator’s methods of tracking them down on the Internet. All names that Chinese readers might be unfamiliar with are supplied with information such as the interviewee’s identity and most famous achievements and, where possible, relevant or familiar facts that might help readers understand. ﹀
分类号：	H087/TP391
论文总页数：	182
参考文献总数：	17
馆藏号：	017/M2013(0058)
公开日期：	2013-06-08

商业翻译团队成员人格特质对团队绩效的影响研究.潘媛


题名：	商业翻译团队成员人格特质对团队绩效的影响研究
姓名：	潘媛
学号：	1001210780
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	翻译团队；人格特质；互动；团队绩效
论文摘要：	︿为了适应商业翻译市场质量和时效的双重需求，翻译团队成为目前商业翻译领域的主流工作模式，但对其研究重视程度却远不如独立个体的翻译实践研究。事实上，翻译团队经常出现由于团队成员之间协作不畅而导致翻译效率低下、译文质量不合格的情况发生。前人研究表明，团队成员特质与团队绩效有着密切的联系。翻译团队作为团队类别的一种，对其团队的相关研究很少从人格特质这一角度进行。本研究将以北京译无止境翻译公司为例，从翻译团队成员人格特质的角度对商业翻译团队绩效的影响进行分析。本研究通过对翻译项目团队进行深度访谈，基于团队效能【输入-过程-输出】（简称IPO）模型，以翻译团队成员人格组成为输入变量，以成员间的沟通、支持和人际冲突为过程中介变量，以译文一致度和成员能力成长为输出变量，构建了翻译团队的IPO模型，收集了19个翻译团队的团队绩效模型数据，以此进行人格特质对翻译团队绩效影响的实证研究,结果发现翻译团队成员的宜人性和严谨性对团队互动过程和团队绩效有显著影响。在观察团队成员互动过程中则发现，高水平译员的宜人性对团队内沟通和支持起到显著影响，高外倾性对活跃团队沟通氛围非常有效，而严谨性则能够在一定程度上弥补低外倾性对团队绩效的影响。根据研究成果，本研究从人格特质的角度为翻译团队的组建提供了一定的建议。﹀
分类号：	H059/F270
论文总页数：	76
参考文献总数：	40
馆藏号：	017/M2013(0246)
公开日期：	2013-06-08

从景观翻译看旅游文体的翻译美感体现.玉薇敏


题名：	从景观翻译看旅游文体的翻译美感体现
姓名：	玉薇敏
学号：	1001210964
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
论文答辩日期：	2013-06-08
关键词：	英中翻译；景观翻译；翻译美感；音美；形美
论文摘要：	︿在旅游业高速发展的今天，外国旅游文体翻译是中国游客感知国外文化的重要渠道。本文选择旅游类书籍《女人必须去的100个意大利景点》（100 Places in Italy Every Woman Should Go）进行翻译，翻译内容涉及教堂、历史遗迹、博物馆、著名建筑、雕像油画、别墅和花园。女性视角作为本书的一大特色，在书籍的翻译过程中得到很好的体现。本文针对旅游文体中的景观翻译，从音美和形美两方面探讨译者如何体现景观翻译的美感。音美主要从尾韵、拟声词和叠音词的应用出发，通过恰当地调整音律达到音韵的美。形美主要从词汇美、句子美和段落美出发，提出译者可运用四字成语，以达到辞藻的繁复和华美，让读者直观地感受到文字的美丽，产生阅读兴趣。﹀
外文摘要：	︿ Foreign tourism translation is a vital means for Chinese travelers to learn about foreign culture in today’s rapid development of tourism. 100 Places in Italy Every Woman Should Go (2010) by the American travel writer Susan Van Allen offers a unique female perspective of Italy’s historical sites and famous architecture such as its churches, museums, statues, paintings, villas and gardens. The present report analyzes the aesthetics in the target text as deriving from the beauty of the Chinese language in sound and form. Through the application of particular end rhymes and onomatopoeia, Chinese translation achieves a beauty of sound. The beauty of form is obtained on three levels as words, sentences and paragraphs are rendered through an extensive use of four-character idioms. ﹀
分类号：	H059/H315.9
论文总页数：	199
参考文献总数：	11
馆藏号：	017/M2013(0385)
公开日期：	2013-06-08

小说中宗教文化的翻译策略研究——以《This Time Forever》中译为例.官毅


题名：	小说中宗教文化的翻译策略研究——以《This Time Forever》中译为例
姓名：	官毅
学号：	1001210607
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-08
外文题名：	On the Translation Strategies of Religious Culture in the Novel: A Case Study of This Time Forever
关键词：	小说；宗教文化；摩尔门教；翻译策略
外文关键词：	novel; religious culture; Mormon; translation strategies
论文摘要：	︿《此时永恒》是美国作家Rachel Ann Nunes所写的小说。它以朴实的语言描述了摩尔门教徒的家庭生活，以及宗教文化对人物的为人处事、婚姻观念等的影响。本文的研究对象是这本小说中的宗教文化及其翻译策略。具体的翻译策略如下：小说中出现的宗教文化包含宗教术语，翻译时要注意术语的正确性和统一性。对于人物对话中出现的宗教术语，适合使用注释法或增补法，为控制注释的数量，对不重要的宗教术语进行解释性的扩展而不加注。对于宗教词汇中一词多义的现象，要根据语境选择其含义。小说人物中体现宗教思想的语句，主要采取转换法：将被动转为主动，名词转换为动词，以符合汉语重动态的特点。小说中对摩尔门教徒的婚姻观进行了描写，翻译时总结了其婚姻观里非常重视宗教信仰，而且对英汉语句子重心进行了分析，指出英语句子中重心靠前，而汉语的句子重心在后，因此翻译时要根据汉语的习惯来调整语序。对于礼拜场景中的英文定语从句，可以采取抽取句子主干的方法，再按照汉语修饰成分在前的特点加入其修饰语，这样使定语从句与句子主干相融合，翻译成一个紧缩复句。总之，译者要充分了解宗教的背景知识，理解其宗教文化再进行翻译。在翻译实践中，译者可以采用异化的翻译策略保留原文宗教文化的特色，但同时要考虑读者的理解能力和阅读体验，采用归化的翻译策略使译文通顺流畅。﹀
外文摘要：	︿ This Time Forever is a novel written by American writer Rachel Ann Nunes about the life and marriage of a Mormon family in Utah. The present report studies the religious culture in this novel and its translation strategy. Since Mormonism is largely unfamiliar to most Chinese readers, special attention is paid to its translation. Religious terminology concerning Mormonism’s organization and activities is translated strictly according to the designations on the official website of the Church of Jesus Christ of Latter-day Saints and the MormonWiki. Translation strategies of foreignization and domestication are adopted both to preserve the religious aspect of the novel and to make it easier for Chinese readers to understand. Methods of annotation and supplementation are also applied where necessary. In addition, the translation of long sentences is discussed in detail. ﹀
分类号：	H059/H315.9
论文总页数：	178
参考文献总数：	18
馆藏号：	017/M2013(0124)
公开日期：	2013-06-08

译者主体性下的通俗读物语言翻译.关怡然


题名：	译者主体性下的通俗读物语言翻译
姓名：	关怡然
学号：	1001210604
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	译者主体性；翻译活动各方利益；通俗读物；星座；占星术
外文关键词：	translator’s subjectivity; interests of all translation parties; popular book; astrological signs; astrology
论文摘要：	︿此次翻译实践中，笔者感受到翻译是一个超越文字层面的错综复杂的问题，涉及到出版商、目标市场、读者预期、原作者、原文语言、原文背景文化、译入语社会背景、文化背景等各方面的利益。尤其本次翻译实践完成的中文版图书被出版社定位为通俗读物，与原书和原作者的定位有所差别，更需要译者在翻译中发挥主体性。多方利益的实现只有在译者的综合权衡下才有可能达成，因此，译者是翻译活动的主体，甚至起到“中心”作用，这就是译者的主体性。国内学者对译者主体性的定位也认为，译者主体性体现在译者同翻译活动中其他对象的互动关系中，是一种从宏观角度对翻译活动的审视。本文中，笔者谨以 The Secret Language of Relationships—Your Complete Personology Guide to Any Relationship with Anyone 一书的部分翻译为例，论述笔者作为译者如何发挥主体性考虑各方利益、并依据各方利益的引导制定合适的翻译策略、最终将翻译策略贯彻到翻译实践中去。﹀
外文摘要：	︿ In this translation practice, it is recognized that translation is not only about words and texts. In fact, it is a most complicated process concerning many parties: the publisher, the market, the reader’s expectation, the source text, its author and cultural background, as well as the social and cultural backgrounds of the target language. This is especially true in this translation practice, since its product is positioned by the Chinese publisher as a popular book, which differs from the author’s own positioning. So the translator has to seek a translation strategy that balances all parties’ interests, for which purpose he/she might resort to his/her subjectivity as the mediating ground. Taking the translation of The Secret Language of Relationships: Your Complete Personology Guide to Any Relationship with Anyone as an example, this report elaborates how a translation strategy forged to integrate all parties’ interests can be put into practice through the translator’s subjectivity. ﹀
分类号：	H059
论文总页数：	28
参考文献总数：	14
馆藏号：	017/M2013(0123)
公开日期：	2013-06-08

在线同伴反馈翻译教学研究.赵玉涛

题名：	在线同伴反馈翻译教学研究
姓名：	赵玉涛
学号：	1001211028
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	翻译教学；在线同伴反馈；CATTP
论文摘要：	︿国内外关于同伴反馈的研究主要集中在写作教学领域，翻译教学领域鲜有涉及。同伴反馈写作教学研究表明，同伴反馈能够改善学生的写作状况，对学生写作成绩提高有着促进作用。鉴于同伴反馈对写作教学的积极影响，有必要尝试将同伴反馈教学方法迁移到翻译教学中来，对同伴反馈翻译教学进行深入的探索和研究。本文借助北京大学语言信息工程系研发的计算机辅助译者训练平台（Computer-Aided Translator Training Platform，简称CATTP）在西安外国语大学本科翻译专业三年级进行了对比实验，重点研究了学生行为、翻译错误和教学效果。本文从反馈这一概念入手，首先介绍了反馈教学的理论基础和反馈的分类，归纳了当前同伴反馈写作教学的研究内容，综述了在线同伴反馈写作教学的现状，重点介绍了实验研究中用到的重要技术工具CATTP。其次，从研究目的、研究对象、研究工具、实验过程几个方面详细阐述了整个翻译教学实验。再次，对从翻译教学实验中获得的数据进行描述和分析，具体内容包括学生的反馈行为和修改行为、学生译作中的翻译错误、实验前后的学习成绩、学生对同伴反馈的态度。通过分析得出实验结论。最后，指出本文的贡献和不足，并对未来工作进行展望。研究表明，在线同伴反馈能够帮助学生提高翻译学习成绩。在完成和修改译文的过程中，学生对译文的词汇、语法、表达给予了较多的关注。总体看来，多数学生表示在线同伴反馈对他们的翻译学习有益，在线同伴反馈使他们有机会进行作业对比，互相学习，实现借鉴和反思。﹀
分类号：	H085/H059
论文总页数：	88
参考文献总数：	0
馆藏号：	017/M2013(0437)
公开日期：	2013-06-08

软件用户手册的英译汉研究—以《Wordsmith Tools Manual-Version 6.0》翻译工作为例.孙俊方

题名：	软件用户手册的英译汉研究—以《Wordsmith Tools Manual-Version 6.0》翻译工作为例
姓名：	孙俊方
学号：	1001210823
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	软件用户手册；科技文体；实用文体；特点；翻译
论文摘要：	︿软件用户手册是用户了解软件的重要途径之一。目前有很多软件是来自国外的，而中国的用户需要使用本国语言的用户手册才能更快地学习软件。用户手册的翻译也逐渐受到更多的关注。科技翻译发展到今天，有人将说明书和用户手册的翻译归入到科技翻译的范畴，并对整体的科技翻译进行研究。但是也有研究者认为用户手册的实用性很强，将其划入特别的实用文体的研究范围，并对其进行研究。笔者尝试在本文中专门针对用户手册进行研究，结合自己的翻译实践，总结出了软件用户手册的总体特点以及词句特点，并提出了相应的翻译策略：（1）软件用户手册的整体特征。软件用户手册具有实用文体的部分特征：信息性、劝导性和功利性，以及科技文体的部分特征：术语性、符号性和准确性。除此之外，软件用户手册还具有大量的不译因素，且行文语气亲切。（2）笔者总结了软件用户手册的词汇特点：普通词作为专业词汇、多使用缩略语。词汇的翻译策略有：保留原文并附加翻译；借助词典和书籍，沿用已有翻译；词典加定义选择最佳翻译；零翻译。（3）软件用户手册的句子特点：句式多，祈使句、条件假设句、被动句等使用频繁；各种句式结合其他成分组成长句。祈使句和条件假设句的情况和翻译比较简单。笔者总结了几种被动句的翻译策略。长句的翻译采用结构分析填充法。笔者在文章中补充了这个方法的后续处理的几种情况。笔者的目的在于专门针对用户手册的特点进行详细分析，并应用详细的翻译策略翻译用户手册，为用户手册的英译汉过程提供翻译支持。﹀
分类号：	H059/H315.9
论文总页数：	30
参考文献总数：	18
馆藏号：	017/M2013(0278)
公开日期：	2013-06-08

“零度”视角下对涉及道德问题的文本的翻译.苏畅

题名：	“零度”视角下对涉及道德问题的文本的翻译
姓名：	苏畅
学号：	1001210815
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	李博婷
导师2单位：	软件与微电子学院
论文答辩日期：	2013-06-08
外文题名：	Translation of Morally Problematic Text in the View of “Zero Degree”
关键词：	零度；道德问题相关文本；英汉翻译
外文关键词：	Zero Degree; Morally Problematic Text; English to Chinese Translation
论文摘要：	︿出于对美国反正统文化历史的兴趣，以及对摇滚乐的喜爱，笔者选译了红辣椒乐队（Red Hot Chili Peppers）主唱安东尼·凯迪斯（Anthony Kiedis）的自传《疤痕组织》（Scar Tissue）。在节选的第四章至第七章文本中，原作者语言风格独特真诚，在向读者展现他坎坷崎岖而又奇幻炽热的音乐之旅的同时，也不乏关于毒品、性爱的描写。作者对社会主流价值观的反叛，对毒品的推崇以及性爱自由的观念，无论对译者的翻译过程还是读者的阅读接受都极具挑战。对于此类有可能引起道德争论的文本的翻译，无论译者还是读者都最好不以道德批判家的姿态出现，因为任何文本都具有其特定的文化和社会价值。笔者在本篇翻译报告中，将以零度视角分析如何对此类文本进行最大限度的理解，和如何向读者展现最贴近原文的翻译。﹀
外文摘要：	︿ Out of interest in the history of the American counterculture and a fondness for rock music, I selected the autobiography Scar Tissue by Anthony Kiedis, lead singer of the famous American rock band Red Hot Chili Peppers. My excerpt runs from chapter 4 to chapter 7, in which the author describes his tough and fantastic road to music in a distinctive voice and with an honest attitude. He also gives a bold description of his life filled with drugs, sex and cheating, which poses a moral challenge to both the translator and the reader. Although the book is morally problematic, neither the translator nor the reader should be judgmental, because any text has its own cultural or social meaning. In this report, I analyze, from the perspective of zero-degree hermeneutics, how to understand such morally disturbing text and how to give the reader a translation as close to the original as possible. ﹀
分类号：	H059/H315.9
论文总页数：	189
参考文献总数：	9
馆藏号：	017/M2013(0270)
公开日期：	2013-06-08

商务英语文体风格三因素分析法及翻译应用.张萌

题名：	商务英语文体风格三因素分析法及翻译应用
姓名：	张萌
学号：	1001210994
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-08
关键词：	商务英语；功能加忠诚；文体分析；翻译
外文关键词：	Business English; Function and Loyalty; Stylistic Analysis; Translation
论文摘要：	︿今天，随着全球范围内商务活动的快速发展，商务英语的重要地位日益突出，对商务翻译人员的翻译质量提出了更高的要求。在商务英语翻译中，如果不能很好地理解原文的文体风格，所得到的译文就很难达到原文所预期的目的。“商务英语的文体意识淡薄，严重制约和影响了译文的质量和商务英语的教学质量。” 作为一种重要的英语文体，许多学者对商务英语的文体风格和翻译原则进行了大量研究，但对商务英语文体风格的认识莫衷一是，甚至有的相互矛盾。为解决风格描述之间的矛盾，有的文献将商务文体再细分为各种功能体裁，来分别讨论风格。本文认为，由于商务英语涉及领域广，题材跨度大，沟通层次多、情况复杂，所以商务英语文体风格的决定因素并不是单一的，而应是多重因素综合考虑。以偏概全地总结商务英语文体风格，很可能对商务翻译人员造成误导，间接造成了目前商务译文中充斥着大量不分场合的商务套语和过分的礼貌客气。本文以功能加忠诚理论为指导，以文体分析为方法论，在结合理论、文献、实践经验三方面基础上，提出了分析商务翻译文体风格的三因素法。并建议译员应在了解和分析上述三因素的基础上，为译文选择合适的词汇、句法、和语篇风格，从而形成恰如其分的整体译文风格。这样才能做到最大程度实现原文目标、并忠诚于翻译活动中的发起人和接收人、以及他们之间的社会关系。﹀
外文摘要：	︿ Today, with the rapid development of global business activities, the importance of Business English is increasingly prominent. This puts forward higher requirements for the quality of business translation. In business English translation, if the style of the source document is not well understood, the resulting translation will hardly achieve the purpose the original intended. "Insufficient consciousness of business English style has been severely constraining and undermining the quality of translation and teaching of business English." Because Business English is an important English style, many scholars have studied its style and translation principles. However, their views of Business English style are not unanimous, and some are even contradictory. In order to resolve the contradictions between these style descriptions, some of the literature subdivides business English into various functional genres and discusses their styles separately. This paper argues that, because Business English covers a wide range of fields and subject matter, is used in multi-level communication, and involves complex situations, its style is determined not by a single factor but by multiple factors considered together. Sweeping summaries of Business English style may mislead business translators, and have indirectly led to business translations flooded with fixed business expressions and excessive politeness regardless of the occasion. This paper takes the Function plus Loyalty theory as its guide and stylistic analysis as its methodology. Based on theoretical study, literature research, and practical experience, it proposes the “Three-Factor Method” as a way to analyze business translation styles. It suggests that translators should understand and analyze these three factors and, on that basis, select appropriate lexical, syntactic, and textual styles to form an appropriate overall translation style, so as to achieve the skopos of the source document to the greatest extent and remain loyal to the initiators and recipients of the translation activity and to the social relations between them. ﹀
分类号：	H315.9/H059
论文总页数：	160
参考文献总数：	26
馆藏号：	017/M2013(0411)
公开日期：	2013-06-08

功能对等理论视角下的英语长句汉译策略研究.陆遥

题名：	功能对等理论视角下的英语长句汉译策略研究
姓名：	陆遥
学号：	1001210757
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	姚亚芝
导师1单位：	北京交通大学语言与传播学院
论文答辩日期：	2013-06-08
关键词：	长句翻译；功能对等
论文摘要：	︿英语长句的翻译是译者在英汉翻译时经常会遇到的难题之一。相对于汉语，英语长句的长度较长，结构也更为复杂，如果翻译处理不恰当，则会产生不通顺、不恰当、翻译腔较重的译文，影响译文读者对文章的理解，更不用说获得与原文读者相似的感受，因此对长句翻译进行研究是非常必要的。本次项目所选择的翻译材料包括《大学成功之道》全书以及《我是怎样获得学位》一书的节选。前者是在作者马尔科姆·高尔德对数百名大学在校生和毕业生进行采访之后，总结整理而成，其中包含了五条指导大学生如何度过充实大学生活的原则；后者是加州大学校长大卫·加德纳的回忆录，其中涉及到美国大学生活的许多方面，可作为前书的背景参考。两者均为面向大众的普及型读物，这一文本特征决定了在翻译时应当以读者为中心、让读者获得对原文足够充分的理解。因此，笔者选择了以读者感受为中心的功能对等理论作为翻译指导原则，该理论一方面给译者很大的自由空间，另一方面为译者提供了方向性的指导思想。在本翻译报告中，笔者基于功能对等理论以及实践当中遇到的实际问题，总结出了前移后置状语、拆分定语成分、补充省略成分、调整插入语位置、排列时间顺序等翻译方法，这些方法在处理英语长句翻译时是行之有效的。﹀
分类号：	H059/H315.9
论文总页数：	210
参考文献总数：	0
馆藏号：	017/M2013(0229)
公开日期：	2013-06-08

英汉翻译中语篇逻辑连接的实例研究.王倩

题名：	英汉翻译中语篇逻辑连接的实例研究
姓名：	王倩
学号：	1001210868
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	外国语学院
论文答辩日期：	2013-06-08
关键词：	逻辑连接；连接关系转换；隐性连接；显性连接
论文摘要：	︿连接是语篇衔接的重要手段之一。一篇文章如果只有文字搭建的框架，远远不能满足读者理解文章意义的需求。因此，要使文章所传达的意思紧密相连，便要有逻辑连接关系的存在。在分别分析过英汉语言逻辑连接的特点之后，我们发现在翻译过程中恰当合理地使用逻辑连接成分是使译文通顺流畅的重要手段之一。因此，本报告中将以翻译的书目——《重走大一生活》为例，探讨和研究英汉翻译中对逻辑关系的处理。所翻译的书目描写是一位有十五年教龄的人类学教授重新作为大一新生，去体会美国当代大学生校园生活与文化。书中常出现对学生日常生活和课堂事件的真实描写，语言风格朴实无华，简单易懂。但在翻译时，译者却发现要向读者呈现切实的场景和事件过程，使读者明了其前后的来龙去脉，就要恰当地处理英文中转折、时间、因果等逻辑连接关系。因此，语篇逻辑连接关系是否清晰连贯、译文内容是否清楚明了就变得十分重要。由于两种语言产生于不同的文化氛围且具有截然不同的语言特征，造就了英汉不同的语篇衔接特点。在翻译时，清楚两种语言的逻辑连接特点，形成从一种语言表达方式到另一种语言表达方式的转换对于译文质量起到至关重要的作用。本翻译研究报告分为五个章节，第一章绪论为“提出问题”的章节：介绍了本翻译项目所选书目的基本情况和其语言特点以及译者在翻译过程中所遇到的问题。第二章中，阐述了逻辑连接关系的理论框架，其中包括英汉逻辑连接的分类、英汉逻辑连接分别的特点以及逻辑连接与翻译的关系。在第三章中，译者分门别类地列举出翻译中涉及到逻辑连接关系的实例，既包含单一逻辑连接关系的实例，也包含在翻译过程中逻辑连接关系发生转换的例子。并且对翻译和审校过程进行分析。第四章为“问题解决策略”，根据列出的翻译实例总结逻辑连接翻译规律及解决策略。第五章进行总结，提出本文局限性和对下一步研究工作的展望。﹀
分类号：	H059/H315.9
论文总页数：	225
参考文献总数：	0
馆藏号：	017/M2013(0310)
公开日期：	2013-06-08

体育传记文本中方式副词的变译方法探究——以 Red Men: Liverpool Football Club – The Biography 中译为例.周游

题名：	体育传记文本中方式副词的变译方法探究——以 Red Men: Liverpool Football Club – The Biography 中译为例
姓名：	周游
学号：	1001211053
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	中国人民大学外国语学院
论文答辩日期：	2013-06-08
外文题名：	A Research in Translation Variation Methods for Sports Biography – Take the Chinese Version of the Book Red Men: Liverpool Football Club – The Biography as an Example
关键词：	体育传记；方式副词；文本表达方式；变译
外文关键词：	sports biography; adverb of manner; expression way of text; translation variation
论文摘要：	︿英国是现代足球的发源地，利物浦作为坐落英格兰西北、默西河畔的海港城市，则是英国足球重镇。本文翻译研究对象 Red Men：Liverpool Football Club - The Biography 一书介绍利物浦当地传统球队利物浦足球俱乐部的诞生和发展史。笔者在体育领域传记文体的英中翻译过程中，对各种情景、情感、记叙方式或评论模式下的方式副词的翻译策略进行探究。笔者以表达方式来划分文本，对不同文本中的方式副词区别对待，以求达到遵从原文信息传达、还原原文文风、贴近汉语表达习惯的翻译效果。笔者考量了前人对于方式副词的变译策略之长短，为探究的文本分配了相应的变译策略，并通过对选书中一定数量的文本案例进行分析，来印证其实效性。按表达方式划分：叙述性语言多为记事，作者采用了相应的叙事节奏、选词偏好、或是语音层面的协调感来配合不同的叙事需求，变译策略应满足该需求。对描写性语言则应按照描摹对象的人物性格、景物特征、场面情状和细节特质等区分开来，分别讨论变译方法。而在比较、分类说明或是比喻、诠释说明性的语言中，为求精确明晰地展示对象特质，应采取简洁化、通俗化的变译思路。在议论性语言中，异化策略是：变副词为其他词性，重新编排句子表述重点，平衡句子重心，以使富于战斗性的议论性句子更易发挥其紧凑篇章、贯通文脉的作用。对于抒情文本，通过增译、处理为四字双声叠韵词等办法，当繁复则繁复、当简约则简约、当前后呼应则前后呼应，从抒情中来，为抒情服务，以求满足原文情感传达和气氛烘托的需求。﹀
外文摘要：	︿ Britain is the birthplace of modern football. Liverpool, a port city on the River Mersey in northwest England, is of great significance to both British and modern football. The object of this translation study is the book Red Men: Liverpool Football Club - The Biography, which describes the birth and development of the city’s traditional local team, Liverpool Football Club. In translating this English sports biography into Chinese, the author explores translation variation methods for adverbs of manner across various scenarios, emotions, narrative strategies and modes of comment. The author classifies the texts by mode of expression and treats the adverbs of manner in each kind of text differently, so that the translation conveys the original message faithfully, restores the original style and reads as natural Chinese. After weighing the strengths and weaknesses of the translation variation methods for adverbs of manner proposed by earlier researchers, the author assigns a corresponding method to each kind of text and analyses a number of examples from the book to verify the effectiveness of each method chosen. The texts are divided by mode of expression as follows. Narrative language mostly recounts events; the author of the biography uses an appropriate narrative rhythm, word preferences and phonetic harmony to suit different narrative needs, and the translation methods should meet those needs. Descriptive language depicts different objects, such as personal character, features of scenery, the state of a scene and traits of detail, and the translation variation method for each should be discussed separately. In expository language used for comparison, classification, simile or interpretation, concise and plain expressions should be adopted in order to present the characteristics of the object precisely. Argumentative language is strongly combative and serves to tighten the writing and connect the threads of an article; to strengthen this function, adverbs should be converted into other parts of speech so as to rearrange the focus of the expression and balance the weight of the sentence. For lyrical language, the author adopts methods such as amplification and rendering the original as four-character Chinese expressions with alliteration or rhyme, making the text elaborate or concise, or setting up echoes between passages, as the original requires, in order to convey the emotion and atmosphere of the original. Such methods arise from the lyrical language and serve it. ﹀
分类号：	H059/H315.9
论文总页数：	155
参考文献总数：	8
馆藏号：	017/M2013(0455)
公开日期：	2013-06-08

从文化预设的角度探讨励志类文本的英汉翻译策略——以《Make More, Worry Less》英汉翻译为例.王方圆

题名：	从文化预设的角度探讨励志类文本的英汉翻译策略——以《Make More, Worry Less》英汉翻译为例
姓名：	王方圆
学号：	1001210856
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	中国人民大学外国语学院
论文答辩日期：	2013-06-08
关键词：	文化预设；励志类文本；翻译策略
外文关键词：	Cultural Presupposition; Text of Self-help Type; Translation Approaches
论文摘要：	︿文化预设是一个社会群体所普遍接受并共享的文化理念，它以潜在假设的形式存在于该社会成员的思想中，成为语言交流的隐含前提。在跨文化的翻译活动中，由于相同的语言表述并不一定能在作者和译文读者脑海中形成相同的文化预设，因此，由文化预设造成的文化误译和文化误读时有发生。文化预设有共有和非共有之分：共有的文化预设源自于两种文化的交集，它能被拥有不同文化背景的人所解读；非共有的文化预设源自于两种文化的差异部分，是文化交流中的障碍，是翻译中需要着重解决的问题。译者只有在对这两种文化预设有深切感知的基础上采取相应的翻译策略，才能产出高质量的译文。本文以励志类文本《Make More, Worry Less》的英汉翻译为例，在参考前人研究的基础上，对文化预设问题的翻译策略进行了归纳总结和举证分析，主要阐述了四种策略：直译法、解释法、替代法和过滤法。直译法是对原文文化信息的直译，主要应用于共有的文化预设。对于非共有的文化预设则可以分别采取解释法、替代法和过滤法三种策略。解释法是在直译基础上的解释，可分为文内解释和文外注释两种处理方式。替代法是指用目的语中常用的文化意象替代原文中陌生而又费解的文化意象。过滤法是指对原文的文化信息进行文化过滤，即直接译出文化意象的预设含义，而将文化形象过滤掉。这四种策略能有效地处理励志类文本中经常出现的文化预设问题，对提高译文质量具有现实的意义。关键词：文化预设；励志类文本；翻译策略﹀
外文摘要：	︿ Cultural presupposition refers to the cultural concept accepted and shared by a certain community. It exists in the minds of community members in the form of a potential hypothesis and becomes the implicit premise of language communication. During the translation process, the same description may generate different cultural presuppositions in the mind of an author and the minds of target readers, so cultural mistranslations and misunderstandings caused by cultural presupposition occur from time to time. Cultural presupposition can be classified into two categories: the shared and the unshared. Shared cultural presupposition originates from the intersection of both cultures and can be understood by people of different cultural backgrounds. Unshared cultural presupposition originates from cultural differences and is the communication obstacle that translation needs to overcome. Only when a translator distinguishes them well and applies proper translation approaches can he produce translations of high quality. Based on the translation of the self-help book “Make More, Worry Less” and on previous research, this report summarizes four approaches to dealing with cultural images in self-help texts: literal translation, explanatory translation, culture replacement and culture deletion. Literal translation is used for cultural images that carry a shared cultural presupposition. For cultural images that carry an unshared cultural presupposition, the translator can pick one of the other three strategies. Explanatory translation means explaining the cultural image after a literal translation, by either in-text explanation or annotation. Culture replacement means replacing an unfamiliar and obscure cultural image with a familiar image of the target language. Culture deletion means conveying the presupposed meaning of the cultural image while deleting the image itself. These four strategies can be effectively used to solve the problems caused by cultural presupposition in self-help texts, and they are of practical significance for improving the quality of translations. Key words：Cultural Presupposition; Text of Self-help Type; Translation Approaches ﹀
分类号：	H059/H315.9
论文总页数：	191
参考文献总数：	20
馆藏号：	017/M2013(0304)
公开日期：	2013-06-08

2013-06-07

面向信息处理的现代汉语数词及常用涉数结构研究.颜秦进

题名：	面向信息处理的现代汉语数词及常用涉数结构研究
姓名：	颜秦进
学号：	1001210942
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2013-06-07
外文题名：	A Research on Numerals and Numeral Structures in Contemporary Chinese for Information Processing
关键词：	信息处理；数词；涉数结构；自动识别；现代汉语语料库
外文关键词：	Information Processing; Numerals; Numeral Structures; Automatic Recognition; Contemporary Chinese Corpus
论文摘要：	︿数词是汉语词汇系统中不可缺少的组成部分。在现代汉语中，数词虽然通常被认为是小词类，但其与特定词类结合形成的涉数结构种类丰富，且往往承载了巨大的信息量。面向信息处理的数词及涉数结构研究，对于自然语言处理系统（如搜索引擎和机器翻译系统）性能的提高有着重要的意义。然而，已有的相关研究多服务于汉语学习和教学，少数虽针对计算机处理，但研究对象多局限在数字和常用数词的范围内，且过于关注这类表达形式的处理方法、算法等技术细节，没有充分利用传统语言学研究成果。本文以现代汉语中的数词及常用涉数结构为研究对象，以研究对象的计算机处理为研究目的，尝试将传统语法分析和信息处理实践结合起来，其主要内容包括研究对象的分类描述、识别规则构造和自动识别实验。作者先以传统的汉语语法研究成果为指导，结合对北京大学《人民日报》语料库、北京大学CCL现代汉语语料库、国家语委现代汉语通用平衡语料库的考察，提出了考虑到语言工程实际需求的涉数结构层级式分类，并通过实例对这些类别的结构进行了初步揭示。在此基础之上，作者以类似正则表达式的形式，构造了七十余条识别规则，然后基于封闭测试语料（取自《人民日报》语料）和开放测试语料（取自LCMC语料库和国务院政府工作报告），利用自定义的常用涉数结构标记，通过编写Python程序进行了自动识别、标注实验。结果显示，无论封闭测试还是开放测试都达到了较高的识别率和召回率。﹀
外文摘要：	︿ The word class numeral is an indispensable component of the vocabulary system of the Chinese language. Although generally regarded as a minor word class in contemporary Chinese, it combines with certain other word classes to form many kinds of numeral structures, which are often information-intensive. Therefore, information-processing-oriented research on numerals and common numeral structures is of great significance for improving the performance of natural language processing systems (such as search engines and machine translation systems). However, most existing research on numerals is aimed at Chinese learning and teaching. The few computer-oriented studies narrow their object of study down to numbers and common numerals, and often focus too much on the technical details of processing methods or algorithms, neglecting findings from traditional linguistic research. This paper deals with numerals and common numeral structures in contemporary Chinese for information processing. It attempts to combine the perspective of traditional linguistics with that of information processing practice, and its major parts are the classification of the objects of study, the construction of recognition rules, and automatic recognition experiments. Firstly, taking into consideration the practical requirements of language engineering, the author puts forward a hierarchical classification of numeral structures under the guidance of traditional linguistic theory. In addition, the PFR Corpus, the CCL Corpus, and CNCORPUS are explored to provide evidence for the usage of certain structures. Secondly, the author constructs more than seventy rules for the recognition of each category in a form similar to regular expressions. Thirdly, the author conducts two experiments on their automatic recognition by programming in the Python language. The experiments take the forms of a closed test and an open test respectively: the former uses a subset of the PFR as the testing data (the closed set), and the latter, that of the LCMC Corpus and Reports on the Work of the Chinese Government (the open set). The results show high recall rates and precision rates in both sets. ﹀
分类号：	H087/TP274
论文总页数：	80
参考文献总数：	60
馆藏号：	017/M2013(0368)
公开日期：	2013-06-07

面向信息处理的现代汉语时间表达研究.王雅慢

题名：	面向信息处理的现代汉语时间表达研究
姓名：	王雅慢
学号：	1001210885
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-07
关键词：	时间表达；信息处理；自动识别；时间单元
论文摘要：	︿时间表达的研究对于信息处理领域具有十分重要的意义。在机器翻译过程中，准确判定时间表达的意义可以提高语言转换的准确率；而在信息提取系统的设计中，对时间信息的准确提取至关重要……日益智能化的实际应用都要求计算机准确地理解时间，并依此给出期望的处理结果。而时间表达的定位比单纯的“时间词”更具有实际意义。因此，本文所研究的时间表达不仅包括“时间词”，还包括时量结构、方位时间短语等更大的语法单位。面向信息处理的时间表达首先涉及到的是其范围和分类问题。计算语言学家们对于时间表达的分类较单一、粒度大，且一般只考虑“时间词”，未考虑短语的情况；相对而言，传统语言学家的分类比较细，但由于出发点不在于信息处理，如果放在信息处理领域，则适用性和可扩展性都较低。由于时间表达的界定和分类的局限性，目前时间表达的研究的拓展性也受到了限制。本文从信息处理的实际需要出发，对时间表达做了以下几方面的工作：（1）确定时间表达的分类和各子类，构建分类层级标记。综合前人研究成果，从信息处理角度出发，将时间表达分为基本型时间表达和复合型时间表达，基本涵盖了时间表达相关的词和短语。（2）基于已有研究成果，结合人民日报语料，借助类正则表达式制定和验证时间表达的构成规则集。（3）通过基于规则的方法对时间表达进行自动识别，支持一级标注和二级标注。在开放测试集上，准确率和召回率分别达到94.07%和92.25%，F值为93.15%，验证了研究的可行性和适用性。最后，本文对研究中存在的不足进行说明，并对未来努力方向进行了展望。﹀
分类号：	TP391/H059
论文总页数：	101
参考文献总数：	61
馆藏号：	017/M2013(0323)
公开日期：	2013-06-07
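
The abstract of the time-expression thesis above reports a precision of 94.07%, a recall of 92.25% and an F-value of 93.15% for rule-based recognition with regex-like rules. A minimal Python sketch of that kind of pipeline follows. The two rules and the names `TIME_RULES`, `recognize` and `f_measure` are hypothetical illustrations, not the thesis's rule set, and the balanced F-measure F = 2PR/(P+R) is an assumption (it does reproduce the reported 93.15%).

```python
import re

# Hypothetical rules for illustration only; the thesis's own rule set
# is not given in the abstract.
TIME_RULES = re.compile(
    r"(?P<DATE>\d{4}年\d{1,2}月\d{1,2}日)"
    r"|(?P<DURATION>[一二三四五六七八九十两]+个?(?:小时|分钟|天))"
)

def recognize(text):
    """Return (label, matched_text) pairs for every rule match, left to right."""
    return [(m.lastgroup, m.group()) for m in TIME_RULES.finditer(text)]

def f_measure(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

if __name__ == "__main__":
    print(recognize("会议于2013年6月7日举行，历时三个小时"))
    print(round(f_measure(0.9407, 0.9225), 4))
```

Keeping each rule as a named group means the label of a match comes straight from `Match.lastgroup`, so adding a rule only requires adding one alternative to the pattern.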

搜索广告中不相关广告识别算法的研究.吴春煦

题名：	搜索广告中不相关广告识别算法的研究
姓名：	吴春煦
学号：	1001210898
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-07
关键词：	不相关广告；分类器；交叉特征；LDA；逻辑回归
论文摘要：	︿当今互联网早已成为人们获取信息和社交等活动的重要平台，而搜索引擎是亿万网民最常使用的互联网工具之一，搜索引擎根据网民输入的查询词来分析网民的搜索意图从而帮助网民找到所需，同时也利用网民的搜索意图展示广告而获得收入。搜索引擎中的广告已经在影响着亿万网民的生活。然而大量不相关搜索广告的存在影响着网民的用户体验，也伤害了搜索引擎公司和广告主的利益，因此需要对不相关广告进行识别和过滤。搜索广告不同于一般文本，广告往往字数较少，因此存在包含信息少、有歧义等问题，不能够完全照搬一般文本的处理方法，给不相关广告识别的任务带来了很大困难；另一方面，广告相关性不等同于广告相似性，它有着自己的定义。本文设计了一种基于分类技术的识别不相关广告的算法。本文的主要研究工作有：1、将不相关广告的识别问题转换为分类问题，采用逻辑回归来作为分类器；2、提出了基于词的交叉特征和基于bigram的交叉特征，在字面这一层次来解决不相关广告的识别问题；3、提出了基于LDA的交叉特征，从而引入主题层面来改进不相关广告的识别问题，并且为了满足单机快速训练LDA的需求，设计了并行化LDA算法。本文设计的不相关广告识别与过滤算法在搜狗的广告系统实际应用中取得了很好的效果。﹀
分类号：	F713.82/TP311.13
论文总页数：	68
参考文献总数：	40
馆藏号：	017/M2013(0333)
公开日期：	2013-06-07
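
The abstract of the irrelevant-ad thesis above mentions word-based and bigram-based cross features fed to a logistic regression classifier. The sketch below shows one plausible reading, assuming a cross feature is simply the pairing of one query unit with one ad unit. The function names and the `"|"` and `"_"` separators are hypothetical; the abstract does not give the thesis's actual feature definition.

```python
from itertools import product

def bigrams(tokens):
    """Adjacent token pairs joined with '_', e.g. ['a', 'b', 'c'] -> ['a_b', 'b_c']."""
    return [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]

def cross_features(query_units, ad_units):
    """Pair every query unit with every ad unit; each pair is one binary feature."""
    return {f"{q}|{a}" for q, a in product(query_units, ad_units)}

if __name__ == "__main__":
    query = ["鲜花", "速递"]
    ad = ["鲜花", "网"]
    print(sorted(cross_features(query, ad)))                    # word-level crosses
    print(sorted(cross_features(bigrams(query), bigrams(ad))))  # bigram-level crosses
```

Each feature string would then be mapped to a column of a sparse 0/1 vector, which is the usual input form for a logistic regression classifier.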

俄汉辅助翻译平台设计与实现.勾一博

题名：	俄汉辅助翻译平台设计与实现
姓名：	勾一博
学号：	A0817369
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	信息科学技术学院
论文答辩日期：	2013-06-07
关键词：	俄汉翻译；电子词典；协同翻译；分布式索引；相似度计算；尾词索引
论文摘要：	︿据调研统计，从外文翻译过来的信息占我军与外军交流信息总量的60%以上。目前军事信息翻译工作中存在人力资源有限、辅译装备滞后等问题，信息翻译的速度远远落后于信息获取的速度，亟需自动化的翻译辅助平台提高工作效率。本论文研究了计算机辅助翻译平台的相关技术，充分考虑俄汉翻译的特点，设计了CAN-Tree分布式索引方案，提出了基于分布式文件系统的存储索引算法，利用分布式哈希表解决大块语料的快速检索方法；在相似度计算技术方面，研究并比较了多种相似度计算算法，结合实际情况提出了一种新的词语相似度计算方法，克服了采用基于上下文的词语相似度计算算法带来的数据稀疏问题；对基于语义的语句相似度算法进行了改进，提出了一种可保证语句的分句或短语整体发生长距离移动后，仍能准确的计算其相似度的俄语语句相似度的计算方法。本论文充分挖掘俄语翻译人员数十年积累的翻译知识与经验，构建了适应业务的俄语翻译语料和词典数据库；提出并实现了网络化翻译校审协同工作流程；实现了分布式索引和相似度计算等技术；实现了基于尾词索引的图片词典检索技术和协同翻译技术；实现了词典的逆序查询、片段搜索和多语种查询等功能。实践证明该辅助翻译平台的技术创新是有效和可靠的，能大幅提高翻译人员的协同工作效率，有效地满足了业务工作急需，具有很好的效益和广阔的推广应用前景。﹀
分类号：	H085/H355.9
论文总页数：	61
参考文献总数：	50
馆藏号：	017/M2013(0722)
公开日期：	2013-06-07

基于图论和进化博弈论的聚类算法研究与应用.毕超

题名：	基于图论和进化博弈论的聚类算法研究与应用
姓名：	毕超
学号：	A0717048
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-07
关键词：	聚类；特征提取；相似度计算；图论；主集；进化博弈论
论文摘要：	︿作为一种无指导的机器学习方法，聚类分析在模式识别、自然语言处理等领域得到了广泛而深入的研究和应用。本论文试图引入一种图论和博弈论相结合的聚类算法，经测试该算法表现出一定的有效性和健壮性。本文分析了图论和进化博弈论相互关联的基本理论：（1）基于图论的基本理论定义了主集概念，并将聚类问题建模为图问题，论证了图的主集与聚类中“类”的等价关系；（2）论证了寻找图的主集问题与求解二次型极大值问题的等价关系，即二次型极大值解的非零分量所组成的集合对应于图的顶点集合即为主集；（3）论证了模仿者动态方程的进化稳定解与二次型极大值的解之间的等价关系，得出模仿者动态方程的进化稳定解的非零分量组成的集合即为图的主集，也即聚类分析所得的“类”。据此，论文实现了图论和进化博弈论相结合的聚类算法，以划分思想进行图顶点的聚类，以进化博弈论模仿者动态方程模拟聚类问题并求出进化博弈的稳定解，因稳定解的非零分量对应主集，不断从图中去除主集对应的顶点和边，在新产生的图中继续寻找主集并去除，直到图的顶点、边集为空，聚类完成。为了评价和验证基于图论和博弈论的聚类算法的有效性，本文先是采用美国加州大学欧文分校的机器学习数据库（UC Irvine Machine Learning Repository）的标准数据集对算法进行了验证，取得了较好的效果；为进一步验证算法的可拓展性，本文将该算法应用于气候变化领域的全球地表气温分区模拟试验，取得了与联合国政府间气候变化专门委员会第四次评估报告基本相一致的研究结论，体现了该算法具有一定的适应性和有效性。﹀
分类号：	TP181/TP311.13
论文总页数：	77
参考文献总数：	38
馆藏号：	017/M2013(0716)
公开日期：	2013-06-07
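
The abstract of the clustering thesis above describes finding a dominant set (主集) as the support of a stable solution of the replicator dynamics, removing it from the graph, and repeating until no vertices remain. A minimal pure-Python sketch of that loop follows, assuming a symmetric, non-negative similarity matrix with a zero diagonal and the discrete-time update x_i ← x_i (Ax)_i / (xᵀAx). The function names, the tolerance and the support threshold are hypothetical choices, not taken from the thesis.

```python
def replicator_dynamics(A, iters=1000, tol=1e-9):
    """Iterate x_i <- x_i * (A x)_i / (x^T A x), starting from the uniform distribution."""
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(iters):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        xAx = sum(x[i] * Ax[i] for i in range(n))
        if xAx == 0:  # no similarity mass left to redistribute
            break
        new_x = [x[i] * Ax[i] / xAx for i in range(n)]
        converged = max(abs(a - b) for a, b in zip(new_x, x)) < tol
        x = new_x
        if converged:
            break
    return x

def dominant_set_clustering(A, support_eps=1e-4):
    """Peel clusters off one at a time: the support of each solution is one cluster."""
    remaining = list(range(len(A)))
    clusters = []
    while remaining:
        sub = [[A[i][j] for j in remaining] for i in remaining]
        if all(w == 0 for row in sub for w in row):
            clusters.extend([v] for v in remaining)  # isolated vertices become singletons
            break
        x = replicator_dynamics(sub)
        members = [remaining[k] for k, w in enumerate(x) if w > support_eps]
        clusters.append(members)
        remaining = [v for v in remaining if v not in members]
    return clusters

if __name__ == "__main__":
    # Two tight groups, {0, 1, 2} and {3, 4}, with weak links between them.
    A = [
        [0.0, 1.0, 1.0, 0.05, 0.05],
        [1.0, 0.0, 1.0, 0.05, 0.05],
        [1.0, 1.0, 0.0, 0.05, 0.05],
        [0.05, 0.05, 0.05, 0.0, 0.9],
        [0.05, 0.05, 0.05, 0.9, 0.0],
    ]
    print(dominant_set_clustering(A))
```

The number of clusters is not fixed in advance: it falls out of how many times a dominant set can be peeled off before the graph is empty.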

基于机器学习的Twitter名人分类研究.辛洁

题名：	基于机器学习的Twitter名人分类研究
姓名：	辛洁
学号：	1001210922
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-07
关键词：	Twitter；名人；SVM；N-gram；LDA；分类
论文摘要：	︿ Twitter是继博客之后兴起的一种新的社交网络平台，支持短文本信息（tweets）发布和分享，具有速度快和实时性等特点，时至今日已经成为人们用于表达思想、传播内容和交流学习的重要平台。本文采用机器学习的方法对Twitter名人的分类问题进行了研究，并在真实的Twitter数据上进行了实验和评测。本文首先结合以往论文和实际的Twitter数据建立包含10种类别的Twitter名人类别体系，具有较强的客观性和实用性。并选取1000个Twitter名人进行类别标注，用于后续实验。接着利用Kwak数据集和Twitter API获取分类所需要的语料，主要有三类：用户所属的Twitter列表（lists）、用户的自我介绍（bio）和用户发布的短文本信息。然后采用n-gram和潜在狄里克雷分布（LDA）两种方法从语料中抽取特征。n-gram特征尝试了一元（unigram）和二元（bigram），并使用TF-IDF计算特征权重。LDA特征尝试了两种训练方式和多个主题数目训练LDA模型，从文本中抽取隐含主题。经过对多种分类算法的对比选用支持向量机（SVM）作为Twitter名人分类的模型，并选用libsvm和liblinear两个实现SVM的开源软件作为实验工具。最后对不同的SVM分类模型、特征和数据源及其组合进行了多组对比实验。实验结果表明，分类模型线性支持向量机，特征来源Twitter列表和tweets组合，特征种类unigram和LDA特征组合，取得了最佳分类效果，精度达到0.765，微平均F1值达到0.667。本文研究的独创性在于：1.研究了一个较新且较为困难的任务，Twitter名人的多分类问题；2.将Twitter列表的特征应用于Twitter名人分类的机器学习方法中，并取得了优良效果。本文的研究成果能够对Twitter等社交网络中的名人进行有效分类，可用于社交网络的名人推荐、名人搜索、用户自动标注等应用中。﹀
分类号：	TP274/G250.7
论文总页数：	70
参考文献总数：	32
馆藏号：	017/M2013(0353)
公开日期：	2013-06-07

在线收藏夹标签推荐服务的设计与实现.陈慧挺

题名：	在线收藏夹标签推荐服务的设计与实现
姓名：	陈慧挺
学号：	1001210530
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-07
关键词：	在线收藏夹；社会化标签；标签推荐
论文摘要：	︿在互联网时代，人们在网上浏览信息的时候会采用收藏夹来记录自己喜欢、常用的网站或页面，以方便今后再次使用。传统收藏夹在给我们提供便利的同时，也还存有不足。当人们没有携带自己的电脑时，将很难使用保存在本地的书签。随着人们收藏的书签增多，收藏夹变得臃肿，其中的类别划分也开始变得模糊杂乱，不利于人们便捷地查找所需的页面。本文就是在这样的背景下，提出在线收藏夹的标签推荐功能，辅助用户更好地管理自己的书签信息。本文设计爬虫抓取大量真实的中文书签数据，并对书签数据中标签、域名、书签的分布关系进行了分析，提出基于域名信息、标签间文本相似度等因素的标签推荐算法，改善了传统的基于统计的标签推荐算法。此外还与基于最大熵模型的推荐算法进行了对比实验。最终还将两种算法组合在一起，获得了推荐效果上的大幅提升。采用基于统计和最大熵的混合算法，使推荐结果可以跟随系统内用户标注的标签动态调整，并且具有良好的实时性。本文通过工程实现和真实数据实验，最终验证了通过对域名信息、标签相似度等因素的考虑，可以改良原有的基于统计的标签推荐算法。﹀
分类号：	TP393.4
论文总页数：	47
参考文献总数：	0
馆藏号：	017/M2013(0066)
公开日期：	2013-06-07

基于主动学习策略的中文名址切分标注研究.喻洁琼

题名：	基于主动学习策略的中文名址切分标注研究
姓名：	喻洁琼
学号：	1001210965
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-07
关键词：	切分标注；条件随机场；主动学习；不确定度量；委员会投票；信息密度
论文摘要：	︿随着互联网技术与通信技术的飞速发展，信息抽取逐渐成为信息情报研究领域里最有意义的自然语言处理技术之一。任何现实事件总是发生在一定的时间和空间范畴内。这里，空间信息指的就是中文名址信息在自由文本中的表达。收集中文名址信息，对中文名址切分标记方法进行深入挖掘，探索以自然语言处理技术挑战中文名址信息处理的可能发展途径，为信息提取系统的研发以及相关联的情报分析工作提供支持具有重要意义。对中文名址来说，并无可直接使用的已标注语料来训练模型。在大量已标注中文名址语料获得相对困难，而大量未标注的中文名址获取相对容易的情况下，考虑使用主动学习策略来进行中文名址切分标注研究以改善学习性能。本文的研究内容主要包括：1) 在对中文名址特点分析的基础上，制定中文名址地址段切分规则和类型标记集用以标记训练语料。2) 将中文名址切分标注问题转化为序列标注问题，使用条件随机场模型来训练语料，采用分词标注一体模型以减少分词阶段的错误对标注阶段的累积影响。3) 采用主动学习的策略对所有语料的信息度进行计算，并按其不确定值进行降序排列，迭代地挑选出学习模型最不确定的语料交由人工标注，以达到使用尽可能少的标注语料训练模型以获得尽可能高准确率的目的。使用的主动学习策略有：a) 基于最优N序列熵方法；b) 基于委员会询问的序列投票熵方法；以及c) 基于信息密度的主动学习方法，并针对三种策略进行研究，实现其算法细节，完成与随机标注组的对比实验。实验表明，将主动学习策略应用到中文名址切分标注任务上，结合制定的中文名址切分标注的切分规则和类型标记集，在采用分词标注一体模型的基础上，用2000条随机抽取的中文名址语料进行测试，1) 切分的准确率为96.84%，召回率为95.81%，F值为96.32%；标注的准确率为96.02%，召回率为95.00%，F值为95.51%；2) 与随机标注语料的对比实验，发现在达到相同准确率的前提下，使用主动学习策略，人工标注量可减少50%，通过对比实验证明采用主动学习策略来进行中文名址切分标注任务比传统随机方法挑选语料进行标注训练具有更好的性能以及泛化能力。﹀
分类号：	TP391
论文总页数：	75
参考文献总数：	50
馆藏号：	017/M2013(0386)
公开日期：	2013-06-07
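
The abstract of the active-learning thesis above describes ranking unlabeled samples by uncertainty in descending order and sending the most uncertain ones for manual annotation in each iteration. The selection step can be sketched as follows. This is a simplification: it uses the plain Shannon entropy of a single predicted distribution rather than the N-best sequence entropy, vote entropy or information density named in the abstract, and `select_most_uncertain` and `predict_proba` are hypothetical names.

```python
import math

def entropy(probs):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_uncertain(pool, predict_proba, k):
    """Rank unlabeled items by descending entropy and return the top k for labeling."""
    ranked = sorted(pool, key=lambda item: entropy(predict_proba(item)), reverse=True)
    return ranked[:k]

if __name__ == "__main__":
    # Stand-in for a trained model: item -> predicted label distribution.
    fake_model = {"a": [0.5, 0.5], "b": [0.9, 0.1], "c": [0.99, 0.01]}
    print(select_most_uncertain(["c", "a", "b"], fake_model.get, 2))
```

In a full loop, the selected items would be labeled by hand, added to the training set, and the model retrained before the next round of selection.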

基于Neo4j的地名本体构建研究.杨洁

题名：	基于Neo4j的地名本体构建研究
姓名：	杨洁
学号：	1001210951
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-07
关键词：	地名本体；地名本体关系；Neo4j；机器学习；一阶谓词逻辑；推理引擎
论文摘要：	︿从地名信息入手，本文基于本体形式化地描述了地名的语义特征和空间特征；然后在地名本体和地名本体关系基础上，开辟了一条存储、查询和解析地名本体信息的新方法。该方法使用非关系型数据库（Neo4j图数据库）存储地名本体，然后构建了北京市地名本体数据库。Neo4j图数据库克服了传统关系型数据库处理数据间关系转换的不足。例如，查询地址串“北京市华清嘉园”，Neo4j图数据库可以反馈“北京市”经“管辖关系”到“海淀区”经“管理关系”到“成府路”经“相邻关系”到“35号”经“等价关系”到“华清嘉园”的详细地址串。本文主要包括以下基本内容：第一章　在总结了地名本体的研究背景和意义及国内外有关本体构建、关系识别、本体推理的研究现状之后，本章探讨了地名本体研究中需要解决的地名信息组织问题，并且提出了本文的解决方案。第二章　基于本体的定义和内涵，本章探讨了本体的描述语言与构建原则，并重点讨论了OWL本体描述语言的结构和本体构建工具。在地名本体存储方面，本章还详细分析了Neo4j图数据库的特点及其与关系数据库的不同之处；然后结合本体和关系体系讨论了地名本体关系类型及地名本体关系推理工具。第三章　基于地址要素设计的地名本体概念模型，本章实现了地名本体结构到Neo4j图数据库结构的映射，并在此基础上完成了基于Neo4j的地名本体形式化表达。为了获取构建地名本体的基础数据，本章还重点探讨了地名本体关系语料（约60000对）的构建、最大熵模型及支持向量机模型，并在此基础上设计了地名本体关系识别的特征模板，然后对比分析了最大熵模型和支持向量机模型的实验效果。第四章　在详细探讨了北京市行政区划地名本体构建过程之后，本章实现了地址串中不同地名实体之间的地名本体关系到Neo4j图数据库结构的转换；然后基于Prolog语言和一阶谓词逻辑规则设计的地名本体关系推理引擎，弥补了机器学习方法识别地名本体关系的不足，实现了不同编程语言、工具之间的接口和数据传输。针对地名本体关系的推理，本章基于深度优先搜索和自动回溯算法还重点讲述了四类关系的推理过程。第五章　针对Neo4j地名本体数据库，本章基于查询问题探讨了查询地址的解析和标准化处理问题：首先基于条件随机场模型对查询地址进行解析，然后基于地名文本和地名拼音的混合模式对解析之后的地名进行相似度计算并完成地名的标准化处理，最后在此基础上实现房源信息的匹配任务。﹀
分类号：	TP311.13/TP18
论文总页数：	104
参考文献总数：	67
馆藏号：	017/M2013(0374)
公开日期：	2013-06-07

基于抽象语法树的程序形式化转换研究.马玉超

题名：	基于抽象语法树的程序形式化转换研究
姓名：	马玉超
学号：	1001210772
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-07
外文题名：	Research on Transformation from Programs to Formal Method Based on Abstract Tree Analysis
关键词：	程序形式化转换；抽象语法树；程序模式；C；形式化方法OE
论文摘要：	︿软件质量一直以来都是软件工程领域重点关注的问题，保证软件质量或提高软件质量是研究界和工程界追求和探索的目标。1995年，Bertrand Meyer表示，软件产业的进步将依赖于一种有效的数学方法，即形式化方法。形式化方法在软件工程中的应用，是利用形式化数学证明方法对软件程序进行验证分析，从而保证软件程序的正确性。软件程序形式化包括程序形式化转换和程序形式化验证，其中第一步就是转换，即软件程序向形式化方法的转换。目前，研究领域对程序形式化转换问题多停留于理论研究阶段，也没有成熟的商用软件，转换过程依赖于程序员，自动化程度不高。本文在分析了软件程序形式化转换方法的优缺点后，借助于JavaCC工具，采用基于抽象语法树的C源程序向形式化方法OE转换的技术，并设计实现了一个程序形式化转换原型工具。首先，利用基于“巴克斯诺尔范式（BNF，Backus-Naur Form）”的建模方法，论文对C语言和形式化方法OE中的程序元素和程序结构用抽象语法树进行建模。通过“逻辑等价”关系的推理，验证程序形式化转换模型的正确性。在程序模式层次上对两个模型进行比较，并列出了程序形式化转换需求和分析了程序形式化转换的特性。其次，根据“程序模式”和“抽象语法树（AST，Abstract Syntax Tree）”分析方法，按照转换需求和转换特性，制定了C-OE程序形式化转换的流程，包括预处理、JavaCC初编译、程序模式分析与抽取、规则转换、分析组装目标抽象语法树和后处理六个步骤；然后，根据程序模式的抽象语法树表示，制定了C-OE程序形式化转换的规则；采用扩展的程序形式化转换算法，基本实现转换模型中程序模式的全部覆盖。最后，从软件工程角度，利用J2EE、Spring、Hibernate等开发框架实现了C-OE程序形式化转换原型工具，并对程序形式化转换原型工具进行评估测试，验证原型工具的功能性和实用性。目前，本文研究的转换模型可以正确有效地解决程序形式化转换问题，本文设计的转换原型工具可以覆盖GJB 5369标准C程序的转换，并实现了转换过程的自动化。本文所阐述的内容为程序形式化验证奠定了基础，并为解决程序形式化转换问题提供了有意义的思路和实现方法。﹀
分类号：	TP311.5/TP301.2
论文总页数：	95
参考文献总数：	43
馆藏号：	017/M2013(0241)
公开日期：	2013-06-07

面向交互式机器翻译的汉英平行文本Chunk对齐.吴胜兰

题名：	面向交互式机器翻译的汉英平行文本Chunk对齐
姓名：	吴胜兰
学号：	1001210906
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-07
外文题名：	Chunk alignment of Chinese-English parallel text for interactive machine translation
关键词：	语块对齐；机器翻译；平行文本；双语对齐
外文关键词：	Chunk Alignment; Machine Translation; Parallel Corpus; Bitext Alignment
论文摘要：	︿平行语料库的Chunk自动对齐是计算语言学领域的一个重要研究课题，对基于实例的机器翻译系统、人工翻译等领域有着重要的应用价值。国内外针对双语Chunk对齐已经有了一些研究，主要分为基于规则的方法以及基于统计的方法两大类。Chunk对齐可以在双语句子的Chunk划分结果上进行，也可以让划分与对齐同步进行，互相促进。本文在总结了前人工作的基础上，针对面向交互式机器翻译的汉英平行文本进行了Chunk对齐的研究，提出了几种双语Chunk划分与对齐的方法。基于单词间粘合度与松弛度的双语Chunk划分方法针对句子的Chunk划分，本文提出了单词间粘合度与松弛度的概念，分别代表单词间连接的紧密程度和松散程度，Chunk划分总是倾向于在连接松散程度高的单词间划开。本文尝试使用了单语组块划分、命名实体、依存分析、语言模型等6种不同的特征来计算单词间的粘合度与松弛度，其中包括双语划分的镜面约束特征，该特征可以利用GIZA++词语对齐矩阵让互译句子对的Chunk划分互相约束、互相促进。对每种特征权重进行了参数寻优实验，对各种特征的组合进行了对比实验。本文对Chunk以及Chunk划分结果提出了评分计算方法，并提出了Chunk划分的三种搜索算法：贪心、回溯、动态规划。对三种算法的优劣进行了分析，并分别进行了对比实验。基于双语N-BEST Chunk划分与Beam-Search算法的Chunk对齐方法本文对Chunk Pair相似度以及Chunk对齐结果评分提出了计算模型。针对双语句子对的Chunk对齐，本文利用Beam-Search算法，在Chunk划分的结果上进行最佳Chunk对齐的搜索。对于互译的中英文句子，首先利用本文提出的Chunk划分的回溯搜索算法，可以获取前N个最好的Chunk划分结果。在中英文句子的N^2种Chunk划分组合中，搜索出最佳的Chunk对齐结果。本方法在多种Chunk划分结果中搜索Chunk对齐，在综合了划分结果与对齐结果的基础上进行全盘考虑，在二者之间寻找最佳的平衡点。基于实词对应与相邻Chunk Pair扩展的Chunk对齐方法本方法在GIZA++词语对齐的基础上进行Chunk对齐，首先将词语对齐矩阵中所有的实词（主要包括名词、动词、形容词）看做Chunk Pair，将所有剩余单词看做翻译为空的Chunk，形成初始Chunk对齐结果。在初始对齐的基础上，利用随机选择相邻Chunk Pair进行合并、或者将一个Chunk Pair周围的空Chunk纳入其中进行扩展的方式，修改初始对齐，得到新的对齐结果。在搜索算法中，不断重复随机扩展与合并的过程，直到满足终止条件。对于每次产生的新对齐结果，若其对齐分数比之前的更高，则插入到可自动按照对齐分数排序的N-BEST列表中，搜索结束时， N-BEST列表的第一名即最佳对齐结果。实验结果表明，和MTTK词到短语对齐工具相比，使用本文提出的算法进行Chunk对齐效果更好，准确率和召回率在70%左右。本文还进行了根据对齐结果抽取高质量Chunk Pair的实验，可以根据不同的Chunk Pair评分阈值控制抽取结果的准确率和规模，最好的准确率为94.46%。﹀
外文摘要：	︿ Automatic Chunk Alignment of parallel corpora is an important research topic in computational linguistics, with significant application value for Example-Based Machine Translation (EBMT), human translation, and other domains. Previous work on Bilingual Chunk Alignment falls into two categories: rule-based methods and statistical methods. Chunk Alignment is usually performed after each sentence of a bilingual pair has been partitioned into Chunks separately; however, Chunk Partition and Chunk Alignment can also be performed at the same time so that each improves the other. This thesis summarizes previous work and proposes several methods for Bilingual Chunk Partition and Chunk Alignment on Chinese-English parallel text oriented to interactive machine translation. Bilingual Chunk Partition Method Based on the Degree of Adhesion and Degree of Relaxation between Adjacent Words. For Chunk Partition within a sentence, this thesis proposes two new concepts, the Degree of Adhesion and the Degree of Relaxation between two adjacent words, which measure how tightly and how loosely the two words are connected, respectively; Chunk Partition always tends to split between adjacent words with a higher Degree of Relaxation. Six kinds of features are used to calculate the Degree of Adhesion and Degree of Relaxation, including Monolingual Chunking, Named Entities, Dependency Parsing, Language Models, and the Mirror Constraint of Bilingual Chunk Partition, which uses the GIZA++ word alignment matrix to let the Chunk Partitions of the source and target sentences constrain and benefit each other. Experiments were performed to find the best weight for each feature and the best combination of features. Scoring methods are proposed for a single monolingual Chunk as well as for a Bilingual Chunk Partition. 
In addition, three search algorithms are proposed for Chunk Partition: greedy, backtracking, and dynamic programming. Their pros and cons are analyzed and comparative experiments are presented. Chunk Alignment Method Based on N-BEST Bilingual Chunk Partition and the Beam-Search Algorithm. This thesis proposes scoring methods for Chunk Pair similarity and for Chunk Alignment results. For Bilingual Chunk Alignment, the Beam-Search algorithm is used to find the best alignment in the state space. First, the backtracking algorithm is used to obtain the top-N Chunk Partitions of each sentence. For a source and target sentence pair there are N^2 combinations of Chunk Partitions; the search algorithm finds the best alignment for each combination, and the final output is the best alignment among them. Searching over multiple Chunk Partition combinations takes both the Chunk Partition and the Chunk Alignment into overall consideration, aiming to find the best balance between them. Chunk Alignment Method Based on Corresponding Content Words and Adjacent Chunk Pair Expansion. This method builds on high-quality GIZA++ word alignment. First, all aligned content words (mainly nouns, verbs, and adjectives) are treated as Chunk Pairs, and all remaining words on both sides are regarded as Chunks with an empty translation, which forms the initial alignment. New alignments are generated from the initial alignment by randomly merging adjacent Chunk Pairs or by expanding a Chunk Pair to absorb the empty Chunks around it. The search algorithm repeats the random merging and expansion until the termination condition is satisfied. Each newly generated alignment whose alignment score is higher than the previous one is inserted into an N-BEST list that is automatically kept sorted by alignment score. 
When the search ends, the top entry of the N-BEST list is output as the best alignment. Experiments show that the algorithms proposed in this thesis outperform the MTTK Word-To-Phrase alignment tool, with precision and recall of about 70%. An experiment on extracting high-quality Chunk Pairs from the alignment results was also performed; different Chunk Pair score thresholds can be used to control the quality and scale of the extracted Chunk Pairs, and the best accuracy achieved is 94.46%. ﹀
分类号：	TP391
论文总页数：	105
参考文献总数：	57
馆藏号：	017/M2013(0338)
公开日期：	2013-06-07

疾病自动编码系统的研究与开发.黄家驹

题名：	疾病自动编码系统的研究与开发
姓名：	黄家驹
学号：	10917199
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-06-07
关键词：	疾病分类；疾病编码；电子健康记录；ICD-10
论文摘要：	︿随着计算机处理临床信息的应用的不断增长，电子病历、电子健康记录也在日益普及。在将这些医疗信息电子化过程中，很关键的一步是把医生用非受限的自然语言记录的疾病名称、症状描述等对应到一套统一的编码体系中，使得计算机能够识别这种疾病。除了临床记录外，疾病数据库也需要这样一套编码体系。否则同一疾病的不同别名，不同拼写，或者条目里增加了一些额外的修饰词，以及各种各样的错误都会使得一种疾病无法被正确识别。如果需要同时用到不同的疾病数据库，这个问题会更严重，因为不同数据库疾病命名的规范不同，使得跨数据库的疾病数据整合更为困难。如果做好了疾病编码，这些问题都将迎刃而解。通过疾病编码可以轻易识别和整合疾病。疾病编码使得计算机能更好地处理疾病数据，使得医疗机构、科研机构、医疗保险机构、政府等能够更好地使用疾病数据，也能让病人自己更好地看到他们所患的疾病和人们更为熟悉的疾病之间的关系。不同数据库的疾病可以整合来做基于统计的富集分析。因为数据量的增大，统计分析会更有价值。疾病编码由人工来完成，不但费用大，速度慢，而且错误率高。如果能让计算机自动完成，并且保证高正确率，就会很有价值。本文探讨的就是用一个基于规则的系统来实现疾病自动编码。本文选择的疾病体系是国际疾病分类第十版（ICD-10），因为ICD-10是近几年使用最广泛的疾病体系。本文除了使用拼写纠正、标记化、否定词查找、词袋模型和向量空间模型等自然语言处理中常用的方法外，还使用Google检索结果作为筛选拼写错误的方法，以及将查询和编码条目中的词标注角色，并建立和使用了多个本体词典，当查询词和编码条目中的词不同时，通过比较相同角色的词在本体词典中的关系来确定其能否匹配。最后，本文对实验结果进行了比较和分析。使用了这些技术后大幅提高了疾病自动编码的准确度。﹀
外文摘要：	︿ As computers are increasingly used to process medical information, more and more health records are becoming standardized and electronic. A key step in digitizing health records is to map disease names and symptom descriptions written in unrestricted natural language to codes in a unified coding system, so that the information can be processed by computers. Disease databases also need a unified coding system. Without one, aliases, alternative spellings, additional modifiers, or mistakes of any kind can prevent a disease from being correctly recognized. The problem is even more severe when working with several databases, as they follow different naming conventions, which makes database integration more difficult. If disease coding is done properly, these problems are solved: diseases can be easily identified and integrated by their codes. Disease coding makes it easier for computers to process disease information, so that medical institutions, research institutions, health insurers, and governments can make better use of it. Diseases from different databases can be integrated for statistics-based enrichment analysis, which becomes more informative as the amount of data grows. However, manual disease coding is time-consuming and error-prone, so coding that computers can perform automatically with high accuracy would be very valuable. This thesis discusses a rule-based automatic coding algorithm. The International Classification of Diseases, 10th Revision (ICD-10) is chosen as the coding system because of its widespread use. 
In addition to techniques commonly used in natural language processing, such as spelling correction, tokenization, negation word search, the bag-of-words model, and the vector space model, the coding algorithm tags the roles of the words in the query and in the coding entries, and uses a series of ontology dictionaries: when a query word differs from the word in a coding entry, the relationship between words of the same role in the ontology dictionaries determines whether they match. Finally, the thesis compares and analyzes the results of automatic coding. These techniques greatly increase the accuracy of automatic disease coding. ﹀
分类号：	R366/R195.4
论文总页数：	55
参考文献总数：	26
馆藏号：	017/M2013(0584)
公开日期：	2013-06-07

基于WEB的语义元数据辅助构建平台关键技术研究与实现.郑德举

题名：	基于WEB的语义元数据辅助构建平台关键技术研究与实现
姓名：	郑德举
学号：	1001211036
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	刘耀
导师1单位：	中国科学技术信息研究所
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2013-06-07
关键词：	语义元数据；语义标注；知识获取
论文摘要：	︿近年来，随着信息技术的不断发展，以数字资源形式组织的文本越来越多，信息内容也不断增长。这些信息的内容中，相当一部分包含了各个领域的文本内容知识。对于这些领域文本的内容的获取和处理，逐渐成为对知识进行有效组织和利用的关键。目前，对于领域文本内容的获取和处理，主要依靠人工劳动，比较费时费力。尽管在自动处理领域取得一些进展，但是，仍然没有有效的解决知识的获取、处理、存储和检索等一系列问题。对于以上情况，本文针对语义元数据辅助构建平台的相关关键技术进行了研究和开发。实现了对知识的获取、处理、标注一体化的过程，一方面，标注后的内容可以更好的丰富本体知识，另一方面，标注后可以形成高质量的标注语料，可以提高知识处理中的效率，经过试验，也表现出较好的效果。本文首先针对知识密集型文本片段的获取和处理进行了研究。挖掘知识密集型文本片段的文本特征，以本体语料库中的领域语料为例，研究了属性抽取的相关技术。对网络领域文本的内容获取进行了研究。针对网络领域文本的组织和特点，开发了处理网络领域文本的相关工具。经过试验，可以对网络领域文本进行高效的处理。研究了语义标注相关技术。为了更好的组织领域文本，开发了针对领域文本的语义索引，并且主要针对不同文本内容来源的领域文本进行语义标注，实现对知识的有效组织。为了体现技术上的可行性和有效性，开发了语义元数据辅助构建平台。针对语义元数据的获取、处理、组织等一系列过程，开发了一套适用于领域文本的相关工具。实现了语义元数据的构建、标注一体化。最后，本文介绍了所构建的语义元数据的相关应用。以知识检索为例，介绍了领域文本在构成语义元数据后，可以用作领域知识的检索，是对领域文本知识内容的有效组织。﹀
分类号：	TP274/TP393.4
论文总页数：	73
参考文献总数：	33
馆藏号：	017/M2013(0444)
公开日期：	2013-06-07

领域本体在线辅助构建系统的研究与开发.郭志军

题名：	领域本体在线辅助构建系统的研究与开发
姓名：	郭志军
学号：	1001210613
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	刘耀
导师1单位：	中国科学技术信息研究所
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2013-06-07
外文题名：	The Research and Development of Domain Ontology Online Construction System
关键词：	本体；自动构建；本体学习
外文关键词：	Ontology; Automatic Construction; Ontology Learning
论文摘要：	︿随着计算机技术的飞速发展和互联网技术的不断成熟，各种知识和信息急剧增加，资源越来越丰富，获取也越来越方便。如何有效地组织和管理信息与知识，使其便于共享与利用，已成为一项迫切而重要的研究课题。本体是一种能在语义和知识层次上描述信息系统的概念模型，它能够支持人机之间或机器之间的信息交换、知识共享以及重用；同时，本体也是一种知识表示模型，它表达了概念的结构和概念之间的关系等领域中固有的特征。早期本体的构建工作是通过人工完成的，然而手工构建本体的方法由于构建周期长、成本高、对领域专家依赖性强，加之构建方式耗时且易错，已成为本体工程发展的瓶颈之一。近年来，国内外研究本体的学者正着力于自动或半自动本体构建方法的研究。本文主要研究本体自动构建相关技术，并在相关研究的基础上开发本体辅助构建平台。本文运用自然语言处理技术提出了基于文本内容自动构建本体的方法。本体的构建主要包括创建、进化和管理三个方面。在本体创建方面，本文实现了将三种结构化词表转换为基础本体，同时实现了从专业教材和专业文献提取领域知识关系；在本体进化方面，实现了一种基于互联网的本体自动进化方法，弥补了基于单一文本进行自动构建存在信息量有限、更新慢的缺陷；在本体管理方面，实现了多人在线的辅助校对和版本管理。实验表明，本文开发的平台，能够自动构建并有效管理领域本体。﹀
外文摘要：	︿ With the rapid development of computer and Internet technology, knowledge and information are increasing dramatically and becoming easier to access. How to organize and manage knowledge and information effectively, so that they are easy to share and use, has become an urgent and important research topic. An ontology is a conceptual model that describes an information system at the level of semantics and knowledge. It supports the exchange of information between humans and machines or between machines, as well as the sharing and reuse of knowledge. An ontology is also a knowledge representation model that expresses inherent characteristics of a domain, such as the structure of concepts and the relations between them. Early ontologies were constructed manually. Because manual construction has a long cycle, is costly, time-consuming, and error-prone, and depends heavily on domain experts, it has become one of the bottlenecks in the development of ontology engineering. In recent years, researchers have focused on automatic or semi-automatic ontology construction methods. This thesis studies techniques for automatic ontology construction and, on that basis, develops an assisted ontology construction platform. It proposes a method that uses NLP techniques to construct ontologies automatically from text content. Ontology construction comprises creation, evolution, and management. For ontology creation, the thesis implements the conversion of three types of structured vocabularies into a base ontology and the extraction of domain knowledge relations from professional textbooks and professional literature. For ontology evolution, it implements an Internet-based automatic evolution method, which compensates for the limited information and slow updates of automatic construction from a single text source. For ontology management, it implements multi-user online assisted proofreading and version management. 
Experiments show that the platform developed in this thesis can automatically construct and effectively manage domain ontologies. ﹀
分类号：	TP391/G252.7
论文总页数：	70
参考文献总数：	42
馆藏号：	017/M2013(0127)
公开日期：	2013-06-07

面向技术创新的铝业本体自动构建研究.王明程

题名：	面向技术创新的铝业本体自动构建研究
姓名：	王明程
学号：	10917438
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	刘耀
导师1单位：	中国科学技术信息研究所
论文答辩日期：	2013-06-07
关键词：	技术创新；铝行业；自动构建
论文摘要：	︿信息高速发展的知识经济型社会，不断快速增长的信息资源，在给人们工作和生活带来极大方便的同时，也给人们希望能够快速有效的获取所需信息带来了干扰。本体作为特定领域的概念化的明确说明，通过概念的定义和概念之间的关系来确定概念的精确含义，能够表达出较为精确的、可共享的知识。论文是基于面向企业的技术创新项目，对铝行业这一特定领域进行本体自动构建研究，利用铝行业公认领域知识及专业文献资料，对铝行业生产流程中所涉及的岗位、设备、工艺和安全等关键要素进行知识重构，将概念互联成多位一体的立体网状结构，通过对概念间语义关系提取的实现，不仅起到整合和关联铝行业领域知识的作用，也为铝行业提供了一种优质高效的信息检索服务模式，并为其它领域本体的建设提供了可参考和借鉴的模型。﹀
分类号：	TP391/TP311
论文总页数：	80
参考文献总数：	0
馆藏号：	017/M2013(0598)
公开日期：	2013-06-07

面向专利文献的中文句法分析与错误检测研究.李艳萍

题名：	面向专利文献的中文句法分析与错误检测研究
姓名：	李艳萍
学号：	1001210706
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	刘耀
导师1单位：	中国科学技术信息研究所
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2013-06-07
关键词：	专利文献；中文句法分析；错误检测；N变元方法；语义信息引入
论文摘要：	︿在全球化和知识经济的背景下，专利是促进企业竞争和国家发展的一个极其重要的知识源泉，针对专利文献的研究需求也日益提高。目前的研究集中于专利文献的应用，如专利检索、专利情报分析、专利知识获取等，但这远远不够，研究粒度过粗、语义理解差距较大等问题，迫切需要从宏观转向微观。鉴于句法分析能够自动理解专利文本的优势，本文以专利文献为研究对象，采用不同的方法工具对专利文献进行句法分析，并实现对专利标注语料的自动错误检测，同时引入语义方法提高分析专利文献的准确率。本文首先选取MSTParser(v 0.5)作为中文专利标注语料的依存句法分析器。以专利文献为基础，对其进行词性和依存标注，并将已标注专利作为实验语料，训练、测试、评价模型，观察分析器在处理专利文献与通用题材文本时的性能差异，探索训练语料规模对实验结果的影响，分析原因并总结专利文献的句法特点，为后文的专利错误检测和依存关系优化提供理论支持。同时，本文采取依存和短语两种句法分析方法进行对比实验，得出专利文献句法研究更适于采用依存分析方法。鉴于分析器生成模型的性能除了与分析器本身设计的算法和特征选择机制有关，还与原语料的标注质量有关，错误的专利句法标注会对分析器评价结果产生不同程度的负向影响。因此，本文设计实现专利文献错误检测机制（基于N变元方法），自动检测专利语料中标注不一致的POS元和依存对。在分析检测结果的基础上，对常见依存标注错误类型进行归纳总结，并分别应用非边缘后处理和POS泛化方法提高POS和依存错误检测的查准率和召回率。另外，本文提出引入语义信息的方法，以提高MSTParser分析专利文献的性能。用反映词语义信息的语义编码替换原专利语料预处理分析器输入，使其能够自动携带不同词类的语义信息进行专利模型训练。实验结果显示，采用各语义信息引入方法后，LA等评价指标几乎均有一定提高，各依存关系的F值也有整体提升，证明了引入语义信息的合理性和可行性。﹀
分类号：	TP391/G306
论文总页数：	69
参考文献总数：	0
馆藏号：	017/M2013(0186)
公开日期：	2013-06-07

基于WEB的多领域语料标注加工系统的设计与开发.肖铮

题名：	基于WEB的多领域语料标注加工系统的设计与开发
姓名：	肖铮
学号：	10917481
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	刘耀
导师1单位：	中国科学技术信息研究所
论文答辩日期：	2013-06-07
关键词：	语料标注；WEB；自然语言处理；专业知识库建设
论文摘要：	︿专业领域语料库对于专业领域文献的自然语言处理以及专业文本内容与意图的深层把握非常重要。虽然目前语料库建设理论与技术已经非常成熟，基于本体的领域知识库研究也风生水起，但是由于受专业知识与语言处理的双重限制，目前国内外针对专业知识进行语料库建设的研究仍然较少，除本文作者的导师刘耀及其研究团队针对中医药古文献进行的研究外，鲜有报道。为了进一步探讨专业语料库的设计思想及原理，本文在导师刘耀及其研究团队已有研究成果的基础上，参考了国内外相关领域的研究工作，对WEB环境下的文本标注以及多领域适应性等问题和关键技术进行了深入研究，并设计和实现了一个基于WEB的多领域语料标注加工系统（MDCA，Multi-Domain Corpus Annotation System）。该系统支持专业领域语料的在线标注、专业语义词典的在线校对编辑、专业语义词典自动生成以及词典管理等多项功能。系统生成的词典、标注结果等语料资源通过导入、导出等功能支持资源重用，并通过用户管理功能支持领域专家的有效参与和互相协作，在领域专家的帮助下对专业语义词典进行优化，保证对领域内容所做标注的正确性和权威性。本系统在具体实现时采用了基于词典的语料切分标注方法，既能满足基于WEB的语料处理需求，也能实现多领域适应性目标。为了进一步验证系统的可行性和正确性。本文针对不同领域的专业文本进行了标注实验。该实验通过通用切分标注技术与专业词典以及本系统所创新的专业语义词典分别搭配进行语料标注，结果表明利用各领域固有知识结构及体系进行标注的思想与方法，不仅可以有效地建立知识连结的轨迹，而且可以在语料库中建立该领域的知识架构，更加便于专业领域的知识发现与挖掘，对专业领域知识资源建设具有重要意义。﹀
外文摘要：	︿ Domain-specific corpora are very important for natural language processing of domain literature and for a deep understanding of the content and intent of specialist texts. Although corpus construction theories and technologies are already mature, and research on ontology-based domain knowledge bases is flourishing, little research at home or abroad has addressed the construction of corpora for domain-specific knowledge. There are two reasons for this. First, such work requires professional knowledge beyond that of ordinary annotators. Second, most currently available NLP tools are oriented to the study of general-purpose language phenomena. My thesis advisor, Mr. Liu Yao, and his team have studied ancient literature on traditional Chinese medicine and carried out several system development experiments on the construction and processing of domain-specific corpora. Building on their results and on related work at home and abroad, this thesis studies the key problems and technologies of text annotation in a WEB environment and of multi-domain adaptability, and designs and implements a web-based multi-domain corpus annotation system named MDCA (Multi-Domain Corpus Annotation System). The system supports online annotation of specialist corpora, online proofreading and editing of specialist semantic dictionaries, automatic generation of specialist semantic dictionaries, and dictionary management. Import and export functions allow the dictionaries, annotation results, and other corpus resources produced by the system to be reused. 
In addition, the user management subsystem lets domain specialists take an active part in dictionary editing, share resources, and cooperate online, so that annotations based on the specialist semantic dictionaries are both correct and authoritative. Finally, to verify the feasibility and correctness of the system, this thesis conducts annotation experiments on specialist texts from different domains. The results confirm the viability and correctness of the ideas behind the system: annotating with the inherent knowledge structure of each domain establishes knowledge links that existing tools and systems cannot show, and builds the knowledge framework of the domain within the corpus, while MDCA gives specialist editors online tools for dictionary conversion, editing, export, and privilege management. The resulting domain-specific semantic dictionaries and annotated corpora can be reused within the system, and they are expected to provide useful material for understanding domain-specific knowledge and for the automatic construction of large-scale domain-specific corpora. ﹀
分类号：	TP391/H087
论文总页数：	92
参考文献总数：	0
馆藏号：	017/M2013(0600)
公开日期：	2013-06-07

面向中学英语教学的知识库自动构建研究.黄毅

题名：	面向中学英语教学的知识库自动构建研究
姓名：	黄毅
学号：	10917201
论文语种：	chi
专业：	软件工程
公开时间：	1年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	刘耀
导师1单位：	中国科学技术信息研究所
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2013-06-07
外文题名：	A Study on Automatic Construction of Knowledge Base for English Teaching in Secondary School
关键词：	英语教学；知识库；本体
外文关键词：	English Teaching; Knowledge Base; Ontology
论文摘要：	︿随着国际交流与合作的快速发展，英语在我国中学教学中的地位也越来越突出。当前英语教学资源主要是基于网页和数据库，存在着资源重复、关联性差，缺乏统一术语规范，不能有效的交流与共享等问题。如何有效的解决知识共享、知识表示和知识推理等问题，使教学资源更好的服务于英语教学，这决定着教学资源的质量和学生学习的效率。本体作为特定领域概念化的明确说明，通过概念的定义和概念之间的关系描述来确定概念的精确含义，从而表达出较为精确的、可共享的知识。运用本体的思想来构建语言知识库，可以从根本上解决教学资源共享、重用以及个性化学习等问题。语言知识库应用于中学英语教学，对实现信息化教学具有重大的科研价值和应用价值。因此本文主要研究了以下三个问题：1）在中学英语教学的背景下，怎么利用语料库语言学相关理论指导英语教学？2）怎样利用本体的思想有效的刻画词、短语和句子的概念、属性和关系？3）概念模型建好以后，怎么利用现有资源实现语言知识库的自动构建？本文分为六个章节，绪论章节主要介绍了本文的研究背景、主要工作内容以及组织结构。第二章介绍了语料库语言学相关理论以及本体和知识库的相关背景，并列举了国内外相关的研究现状，指出构建语言知识库的必要性。第三章从语言学角度出发，运用本体的思想构建词与短语的概念模型，并设定了相关属性和关系，构成语言知识库的基础框架。第四章通过引入句法分析，将句子与词、短语等概念有效的关联起来。第五章以英语语法的“主谓一致”作为案例，通过有效分类、规则设定和关系构建解析并扩充了这一知识点，证实了自动构建语言知识库方法的可行性。第六章对本文的研究工作进行了总结，阐述了本文研究的局限性以及未来的研究方向。﹀
外文摘要：	︿ With the rapid development of international communication and cooperation, English teaching is playing an increasingly important role in secondary school education. English teaching resources today are mainly based on web pages and databases, and suffer from duplicated resources, poor relevance, a lack of common terminology standards, and ineffective communication and sharing. How effectively the problems of knowledge sharing, knowledge representation, and knowledge reasoning are solved, so that teaching resources better serve English teaching, determines the quality of those resources and the efficiency of students' learning. An ontology is an explicit specification of the conceptualization of a specific domain. It fixes the precise meaning of concepts through their definitions and the relationships between them, and thus expresses relatively precise, sharable knowledge. Building a language knowledge base on ontological principles can fundamentally address the sharing and reuse of teaching resources as well as personalized learning. Applying a language knowledge base to secondary school English teaching is of great research and practical value for information-based teaching. This thesis therefore addresses three questions: 1) In the context of secondary school English teaching, how can theories from corpus linguistics guide English teaching? 2) How can ontological principles be used to describe the concepts, properties, and relationships of words, phrases, and sentences effectively? 3) Once the conceptual model is built, how can existing resources be used to construct the language knowledge base automatically? The thesis has six chapters. The introduction presents the research background, the main work, and the organization of the thesis. 
The second chapter introduces relevant theories in corpus linguistics and the background of ontologies and knowledge bases, and surveys related research at home and abroad. The third chapter uses ontological principles to build a conceptual model of words and phrases as the foundation of the knowledge base. The fourth chapter introduces syntactic analysis to associate sentences effectively with words, phrases, and other concepts. The fifth chapter takes subject-verb agreement in English grammar as a case study, analyzing and expanding this knowledge point through classification, rule definition, and relationship construction, which verifies the feasibility of the method for automatically building the language knowledge base. The last chapter summarizes the research, states its limitations, and sets out directions for future work. ﹀
分类号：	G434/H087
论文总页数：	70
参考文献总数：	0
馆藏号：	017/M2013(0585)
公开日期：	2014-06-07

2013-06-06

科普书籍英译汉中注释的研究.杨德林

题名：	科普书籍英译汉中注释的研究
姓名：	杨德林
学号：	1001210946
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	朱源
导师1单位：	中国人民大学外国语学院英语系
论文答辩日期：	2013-06-06
关键词：	注释；读者群分类；加注原则；科普翻译；翻译策略
论文摘要：	︿在英汉书籍的阅读中，读者缺少相关的背景知识和专业知识，会带来理解上的障碍和信息传输过程中的知识“缺失”。如果译者在译文中不对这种“缺失”的信息进行注释，则原文中的很多相关信息不能在译文中得到“还原”，最终会影响知识的传播效率。因此，译者在翻译过程中有必要采取注释等翻译策略，将原文中“压缩”的信息“解压缩”出来，或是补充相关的信息，并呈现给读者，提高读者的阅读兴趣和知识的传输效率,让译作更具可读性和易读性。科普性书籍中添加的注释数量多，覆盖面广，其中很多注释具有代表性。因此，译者或学者有必要对其中的注释类别、原则和规律进行研究,为后续的科普英译工作者在添加注释时提供借鉴和指导。但是，目前国内外相关研究状况表明,科普英译汉中的注释研究一直得不到应有的重视，笔者在收集注释研究的过程中，发现相关论文或文献非常少。许多学者的相关研究，也仅仅是停留在比较浅显的层面，很少有学者对其进行较为系统的研究。鉴于此，在本文的研究中，笔者结合相关文献，立足于自身的科普翻译实践,对其中的注释进行系统性地研究，内容涉及到注释的定义和分类，注释与读者、作者、译者的关系，科普翻译中注释的类型、规律等问题，期望为后来的科普翻译工作者提供借鉴。本文的研究重点是研究科普书籍英译汉中注释的原则、类型和规律，难点是科普书籍中注释的类型划分，以及结合注释的类型和面向的读者群来提炼科普翻译英译汉中注释的规律。关键词：注释；读者群分类；加注原则；科普翻译；翻译策略﹀
外文摘要：	︿ When reading books translated from English into Chinese, readers who lack the relevant background and professional knowledge face obstacles to understanding, and knowledge is “lost” in the process of information transmission. If translators do not annotate this “missing” information, much of the related information in the original text cannot be “restored” in the translation, which ultimately reduces the efficiency of knowledge transmission. In the course of translation, it is therefore necessary for translators to adopt annotation and other translation strategies to “decompress” the information “compressed” in the original text, or to supplement related information, and present it to readers, so as to increase their interest in reading and the efficiency of knowledge transmission and to make the translation more readable. The annotations added to popular science books are numerous and cover a wide range, and many of them are representative. It is therefore necessary for translators and scholars to study the types, principles, and rules of these annotations, so as to provide reference and guidance for later English-Chinese translators of popular science. However, current research at home and abroad shows that annotation in English-Chinese popular science translation has never received due attention. While collecting research on annotation, the author found very few related papers or other literature. The existing studies mostly remain at a superficial level, and few scholars have studied the topic systematically. In view of this, the author draws on the related literature and on personal practice in popular science translation to study annotation systematically. 
The study covers the definition and classification of annotation, the relationships between annotation and readers, authors, and translators, and the types and rules of annotation in popular science translation, with a view to providing a reference for future popular science translators. The focus of the thesis is the principles, types, and rules of annotation in English-Chinese translation of popular science books; the difficulties are classifying the annotations in popular science books and deriving rules of annotation from the types of annotation and the target readership. Key Words: annotation; classification of readers; principles of note-making; popular science translation; translation strategy ﹀
分类号：	H315.9/H059
论文总页数：	0
参考文献总数：	0
馆藏号：	017/M2013(0370)
公开日期：	2013-06-06

2013-05-31

交互式写作（翻译）辅助系统.刘强

题名：	交互式写作（翻译）辅助系统
姓名：	刘强
学号：	1001210736
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2013-05-31
外文题名：	Interactive Writing (Translating) Assistant System
关键词：	写作辅助；多引擎模型；模型融合；参数训练；交互式系统；增量模型
外文关键词：	writing assistant; multi-engine model; model combination; parameter training; interactive system; incremental model
论文摘要：	︿计算机辅助翻译系统已经广泛地应用于职业翻译公司和各公司的翻译部门，而现在普遍使用的是机助人译模式，其核心功能是翻译记忆和术语管理。这种模式已经很难对相关翻译和写作人员的工作质量和效率有新的提高了，而全自动机器翻译在短期内也不太会有突破性进展。在这样的形势下，本研究经过考查和分析，提出并实现了一种以译员的翻译输入过程为主导，辅之以多个写作（翻译）预测引擎的融合提示的交互式写（译）辅助系统，探索人机交互写（译）系统对相关工作人员的工作效率的影响。我们设想这个系统通过人当前的键盘输入、已完成的输入信息、多个预测引擎、用户的选择日志、实时增量引擎来完成工作。为了实现这样的系统，我们首先研究了N元语言模型，利用从双语网站抓取并经过处理的语料构建了多个N元语言模型的预测引擎；接着我们引入了词性标记序列辅助预测引擎，帮助其他引擎缩小预测范围；然后我们又分析了非连续搭配的抽取方案，提出并实现了一种基于CWB语料库引擎的非连续搭配抽取算法，构建了搭配预测引擎；此外本文还对实时的语法分析及预测引擎、实时更新的增量预测引擎进行了初步讨论和实验。在上面工作的基础上，本文研究了几种多引擎融合的方法，通过比较和分析，选择了基于对数线性模型的多引擎融合方案，并根据本系统的特点，设计了针对交互式写（译）辅助系统的参数训练方法。此外，我们还设计了一个针对于交互式写（译）辅助系统的评测标准，并研发了模拟交互式写作过程并给出评测的程序。经过不同的参数设置和分组对比实验，结果表明本研究中的系统可以有效地提高相关写作和翻译工作人员的工作效率，是一个值得进一步拓展研究的领域。﹀
外文摘要：	︿ Computer-Aided Translation (CAT) systems are already widely used in professional translation companies and in the translation departments of other companies. Most of them follow the "machine-aided human translation" mode, whose core functions are translation memory and terminology management, and which leaves little room for further improving the quality and efficiency of translators and writers. Meanwhile, fully automatic machine translation is unlikely to see a breakthrough in the short term. Against this background, and after investigation and analysis, this study proposes and implements a prototype interactive writing (and translation) assistant system. In this system the human user leads the writing or translation by typing and by selecting among candidate predictions, supported by a predictor that combines multiple prediction engines. The system is built to explore the impact of interactive writing (translation) assistance on the work efficiency of writers and translators. We envisage that the system works from the user's current keystrokes, the text already entered, several prediction engines, the log of the user's selections, and a real-time incremental prediction engine. To build such a system, we first studied the N-gram language model and built several N-gram prediction engines from a corpus crawled from bilingual websites and preprocessed. We then introduced a POS tag sequence prediction engine, which helps the other engines narrow their search space. Next, we analyzed approaches to extracting discontinuous collocations, proposed and implemented a discontinuous collocation extraction algorithm based on the CWB (IMS Corpus Workbench) corpus engine, and built a collocation prediction engine. The thesis also presents a preliminary discussion of and experiments on real-time prediction of grammatical information by partial parsing and on a real-time incremental prediction engine. 
Building on this work, the thesis compares several methods of multi-engine combination and chooses a combination scheme based on the log-linear model. According to the characteristics of the system, it designs a parameter training method for interactive writing (translation) assistant systems. We also designed an evaluation standard specifically for such systems and developed a program that simulates the interactive writing process and computes the evaluation. Experiments with different parameter settings and comparison groups indicate that the system can effectively improve the work efficiency of writers and translators, and that this is an area worth further research. ﹀
分类号：	H085/TP311.1
论文总页数：	56
参考文献总数：	40
馆藏号：	017/M2013(0211)
公开日期：	2013-05-31

2012-12-10

武侠小说中粗话的英译——以《鹿鼎记》为例.康宇

题名：	武侠小说中粗话的英译——以《鹿鼎记》为例
姓名：	康宇
学号：	10817192
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2012-12-10
关键词：	武侠小说；粗话；翻译策略；《鹿鼎记》；翻译
论文摘要：	︿“粗话”一般指粗俗或含有侮辱性或攻击性的语言。世界上几乎所有语言中都有粗话，它是语言中不可分割的一部分。即使是在物质文明和精神文明都达到很高程度的今天，人们也经常在某些场合使用粗话来表达自己特定的情绪、情感。但是粗话翻译的研究却异常稀少。翻译时，通常是避而不译，或是使用比原文“干净”得多的词来替代。然而，如果不能恰当地翻译粗话，就无法准确地表达原文含义和当事人的情绪，甚至会影响上下文衔接。在文学作品中，粗话是作家刻画人物的重要手段之一，不译或“净化”在很多时候不能呈现原文的风格。因此近年来粗话的翻译问题受到了一些学者的注意，单其昌、冯庆华等人在其编写的翻译教程中把粗话翻译作为独立专题进行论述。武侠小说是中国文学史上不可忽视的一部分，其受众遍及各个年龄段，有些名家作品甚至被译成多种文字，远传海外。《鹿鼎记》是著名武侠小说作家金庸先生的代表作之一，其英译本也受到国外读者的好评和欢迎。小说中使用了很多粗话来刻画书中人物的性格特征，或表达人物喜怒哀乐的情绪，偶尔还作为小说情节的推动剂。因此本文选择其译本为语料基础，对粗话的翻译进行研究。本论文采用定性分析的办法，以《鹿鼎记》中的粗话为例，对其翻译策略和方法进行分析和总结，指出其翻译的合理之处和有待讨论之处，目的是找出粗话翻译的解决方法和应对策略，同时为武侠小说的翻译提供借鉴意义。研究结果表明：（1）中英文粗俗语有各自的分类和特点。（2）中英两种语言间的粗话不是一一对应的关系，翻译时要注意译文是否与原文涵义相符。（3）《鹿鼎记》中粗话的翻译采用的具体策略有：直译、意译、省译、增译以及直译意译结合法。（4）粗话的翻译，不仅要考虑词义的对等，还要考虑说话人的隐含意义。（5）粗话出现的具体语境和载体对其翻译有很大的影响，翻译时须注意。本文针对粗话的英译做出尝试性研究，在实践中对翻译方法提供了一些新的观察视角，以期对不同文化特色的语言翻译工作做出贡献。﹀
外文摘要：	︿ Vulgar language generally refers to language that is coarse, insulting, or offensive. Almost every language in the world contains vulgar language; it is an integral part of language. Even today, when material and spiritual civilization have reached a high level, people often use vulgar language on certain occasions to express particular feelings and emotions. However, research on the translation of vulgar language is very scarce. Translators usually either avoid translating it or replace it with much “cleaner” words. Without a proper translation of vulgar language, the translated text cannot accurately convey the original meaning and the speaker's emotion, and even its cohesion may be affected. In literary works, vulgar language is one of the writer's important means of depicting characters. Omitting or “purifying” it can deprive the characters of their distinctive speech, so that the original style is not fully presented. The translation of vulgar language has therefore attracted the attention of some scholars in recent years; for example, Shan Qichang and Feng Qinghua discuss it as an independent topic in their translation textbooks. Martial arts fiction is a part of the history of Chinese literature that cannot be ignored. Its readers span all age groups, and some famous works have been translated into many languages and spread overseas. Lu Ding Ji is one of the representative works of Mr. Cha (Jin Yong), the famous author of martial arts fiction, and its English version has been welcomed and praised by foreign readers. In the novel, the author uses a great deal of vulgar language to portray the characters' personalities or to express their emotions, and occasionally to advance the plot. This thesis therefore takes Lu Ding Ji and its English translation as its corpus and studies the translation of vulgar language in martial arts fiction. 
The study is qualitative: taking the vulgar language in Lu Ding Ji as examples, it analyzes and summarizes the translation strategies and methods and points out where the translation is sound and where it is open to discussion, with the aim of finding ways of handling vulgar language and of providing a reference for translating martial arts fiction. The findings indicate: (1) Chinese and English vulgar languages each have their own classifications and characteristics. (2) Chinese and English vulgar expressions are not one-to-one equivalents, so in translation attention should be paid to whether the translated text matches the meaning of the original. (3) The methods adopted in Lu Ding Ji for translating vulgar language include literal translation, free translation, omission, addition, and a combination of literal and free translation. (4) The translation of vulgar language should consider not only the equivalence of word meaning but also the speaker's implied meaning. (5) The specific context and the text type in which vulgar language appears greatly influence its translation. This study is a tentative investigation of the Chinese-English translation of vulgar language; it offers some new perspectives on translation methods in practice and aims to contribute to the translation of language with distinct cultural characteristics. ﹀
分类号：	H059/H087
论文总页数：	64
参考文献总数：	46
馆藏号：	017/M2012(758)
公开日期：	2012-12-10

2012-12-08

交传口译的任务型自主学习研究.吴桂兰

题名：	交传口译的任务型自主学习研究
姓名：	吴桂兰
学号：	10817391
论文语种：	chi
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2012-12-08
关键词：	自主学习；口译训练；交传；任务型学习
外文关键词：	self-learning; interpreter training; consecutive interpretation; task-based learning
论文摘要：	︿随着全球经济一体化进程的加快，中国与世界在政治、经济和文化方面的交流日趋频繁，市场对口译人才的需求也越来越大。口译人才的培养需要关注自主学习的效果。受到口译教学课时限制，学生在课堂上训练的时间往往不足，需要课后自主学习和训练，才能巩固和提高口译技能。对于如何有效的进行口译自主学习，在为数不多的研究文献中，研究者给出的多是宏观的理论指导，而没有可实施的具体方案。本文旨在设计一套口译自主学习模式，用于规范口译自主学习过程。这套模式能够帮助口译学习者进行学习管理，提高其口译学习效率和口译水平，使之具备职业译员的能力。本文以任务型学习理论和自主学习理论为基础，结合交传口译对译员的要求以及自主学习的要求，在已有的任务型教学模式基础上设计出了任务型交传口译自主学习模式，并使用实验进行了验证。本文的结构和内容安排如下：第一章阐述本文的研究背景、研究的问题及意义、论文的主要工作，概述了本文的结构安排。第二章为文献研究与分析，调研了自主学习、任务型语言教学、口译教学等理论，了解了交传口译对译员的要求，对译员能力和译员知识结构进行了总结。第三章介绍了笔者提出的任务型交传口译自主学习模式，对该模式的实施流程和实施要点进行详细论述，并给出了实际的例子予以说明。第四章详细讨论了教学实验的设计与实现，包括实验对象、研究工具、实验结果分析以及实验结论。笔者采用口译测试的统计分析和调查问卷的方法验证了任务型自主学习模式的有效性。第五章调研了一些典型的口译训练系统，并提出这些系统存在的缺陷。然后详细叙述了基于网络环境的任务型口译自主训练系统设计与实现。第六章是总结和展望，主要是对全文的研究工作进行总结，并对下一步需要研究的重点和将来工作进行展望。﹀
外文摘要：	︿ With the accelerating process of global economic integration, exchanges between China and the rest of the world in politics, economy and culture are increasingly frequent, which results in a massive demand for interpreters. Interpreter training is inseparable from self-learning because classroom training time is insufficient; interpreting learners need a great deal of after-class practice to enhance their interpreting capacity. The few existing studies give theoretical guidance on autonomous learning of interpreting but provide no implementable program. Therefore, research on self-regulated interpretation learning is of practical significance. A normative model is designed for self-regulated interpretation learning. By following the principles of the model, interpreting learners can improve their learning efficiency and interpretation capacity. The task-based model for consecutive interpretation self-learning is designed on the basis of theories of task-based learning and self-regulated learning, and it also considers the requirements of consecutive interpretation and self-learning. The content of this thesis is organized as follows: Chapter one introduces the research background, research issues and the structure of the thesis. Chapter two provides a literature study and a survey of interpretation systems. Chapter three describes the task-based model for self-regulated consecutive interpretation learning. Chapter four centers on the design and implementation of the self-learning experiments. Experimental results are analyzed in this chapter by statistical methods and questionnaires. Chapter five discusses the design and realization of the author's interpretation training system. Chapter six draws a brief conclusion of this thesis and introduces plans for future study. ﹀
分类号：	H059
论文总页数：	92
参考文献总数：	63
馆藏号：	017/M2012(762)
公开日期：	2015-12-08

修辞学视角下的英文产品宣传册劝说功能研究.缪行

链接

题名：	修辞学视角下的英文产品宣传册劝说功能研究
姓名：	缪行
学号：	10917338
论文语种：	chi
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2012-12-08
关键词：	产品宣传册；劝说功能；语步；修辞诉求
外文关键词：	product brochures; persuasive texts; moves; rhetorical appeals
论文摘要：	︿产品宣传册是介绍产品信息的宣传品，是厂商向消费者推广产品的重要手段之一。其篇幅比报纸和杂志广告更长，可以呈现更丰富详细的信息。为了让潜在消费者产生购买欲望并付诸行动，产品宣传册除了在表层文字上精雕细琢，给消费者以强烈冲击之外，更要借助隐藏在文字之下的劝说策略达到推销目的。目前，在宏观语篇方面，已有修辞结构研究大多针对短文本广告，产品宣传册这种较长篇幅文本很少得到研究者的关注。在劝说策略方面，使用修辞分析理论研究广告劝说功能还不够系统完善，而针对产品宣传册的研究更几乎是空白。为了更好地研究产品宣传册这种长篇幅文本的劝说功能，本文首先从宏观角度将产品宣传册划分为若干部分，提出了产品宣传册特有的修辞结构；然后在微观层面上构建出适合分析产品宣传册的新修辞诉求框架；随后，本研究将宏观结构和微观策略相结合，得到诉求策略在各修辞语步中的分布规律；最后，为探析写作者如何针对不同产品特点使用不同的劝说策略，本文比较了家用电器宣传册和电子产品宣传册的修辞诉求策略分布特点。本研究发现：1）产品宣传册包含以下语步：品牌及产品识别—品牌及产品营销—定位产品/锁定目标消费者—详述产品信息—附加产品推销—鼓励反馈—法律声明。2）各个语步的修辞诉求分布呈现出不同的特点，写作者根据不同语步的交际功能，使用了不同的修辞诉求策略。3）产品宣传册虽然没有使用Connor修辞诉求框架中的全部修辞诉求方式，但却包含了该框架中没有出现的7种新修辞诉求方式。4）家用电器和电子产品宣传册中使用的修辞诉求既有共同也有差异，写作者根据这两种产品的特点采取了相应的劝说策略。本文的理论意义包括：提出了产品宣传册特有的篇章结构，丰富了商务语篇的修辞结构研究；构建了适合分析产品宣传册的新修辞诉求框架，对广告营销文本修辞分析方法进行了新的尝试。在实践意义方面，本文研究成果可以帮助广告营销写作者更好地掌握产品宣传册的劝说写作技巧，同时，也可以为产品宣传册写作教学提供有益的参考。﹀
外文摘要：	︿ Product brochures, longer than advertisements in magazines and newspapers, are a cost-effective means of distributing product information to large numbers of people. As a powerful marketing tool, product brochures not only contain glowing words but also convey a persuasive message under an informative mask. Previous move-based research on promotional discourse has mainly focused on short texts, with little or no attention given to long texts such as product brochures. In terms of persuasive strategies, early studies of marketing discourse have, for the most part, employed a qualitative approach, revealing very little about how rhetorical appeals interact with each other to perform a particular communicative purpose. ﹀
分类号：	H059/F713.8
论文总页数：	109
参考文献总数：	0
馆藏号：	017/M2012(771)
公开日期：	2015-12-08

E-learning在翻译公司的应用和探究.李艺峰

链接

题名：	E-learning在翻译公司的应用和探究
姓名：	李艺峰
学号：	10817220
论文语种：	chi
专业：	软件工程
公开时间：	2年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2012-12-08
关键词：	E-Learning；培训；翻译公司；教学设计
论文摘要：	︿随着翻译技术的发展，计算机辅助翻译软件在翻译公司的运用愈加普遍。翻译公司为了提升竞争力，相关翻译软件的培训自然成为了公司发展策略中非常重要的一个环节。对于翻译项目经理以及翻译人员来说，他们的自身业务水平及工作效率更是决定了翻译公司发展的未来。如何更有效地对翻译项目经理及翻译人员进行计算机辅助翻译软件的培训，并且切实提高他们运用工具的能力，进而提高工作效率是公司领导、行业专家共同面临的问题。基于上述思想，本文从翻译公司的培训问题和需求出发，提出了基于E-Learning的培训设计和学习策略研究。建构主义、行为主义、成人学习理论为基于E-Learning的培训设计奠定了理论基础。针对E-Learning在翻译公司的应用，本文对应用要素、技术体系构建、学习资源体系构建以及运作体系等核心内容做了分析和探讨。本文首先针对E-Learning如何成功导入翻译公司进行探讨，介绍了在导入过程中需要考虑的要素。进而针对E-Learning平台功能及翻译公司实际需求，对E-Learning平台进行特定功能扩展。然后针对翻译公司常见的工具及问题，提出新的教学设计内容及合适的学习策略，以最大化提升培训的效果。最后为了确保E-Learning平台在翻译公司的应用中切实地解决了实际问题，提高了培训效果，构建了一套有效的管理执行和评估体系。全文的研究都是基于笔者在L翻译公司实习过程中所遇到的相关问题，并且在L公司实际参与了E-Learning应用的整个过程。实际应用表明，在翻译公司应用E-Learning是完全值得推广和探究的，无论是对于翻译公司或是相关从业人员，新型的培训方式的应用都有着无可估量的益处。﹀
分类号：	G434
论文总页数：	77
参考文献总数：	30
馆藏号：	017/M2012(759)
公开日期：	2014-12-08

基于统计的《红楼梦》两个俄译本诗歌翻译风格分析和比较.管佳珏

链接

题名：	基于统计的《红楼梦》两个俄译本诗歌翻译风格分析和比较
姓名：	管佳珏
学号：	1001210608
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2012-12-08
外文题名：	Statistics-based Analysis and Comparison of Poetry Translation Style of Two Russian Translations of A Dream of Red Mansions
关键词：	语料库；翻译风格；《红楼梦》俄译本诗歌
外文关键词：	Corpus; Translation style; Poetry of the Russian translations of A Dream of Red Mansions
论文摘要：	︿《红楼梦》有着极高的文学和社会研究价值，但是目前涉及《红楼梦》两个俄译本的翻译风格比较的研究较少，在研究方法上也多为例证式的定性研究，几乎没有进行过定量研究。基于语料库的翻译风格研究近年来发展迅速，从定量角度研究翻译风格，不但可以验证定性研究的结论，而且可以揭示和研究更多的从未被发现或研究的课题。本文致力于使用定量研究方法对《红楼梦》1958年俄译本和1995年俄译本诗歌的翻译风格进行比较和分析，开辟《红楼梦》俄译研究新的研究视角，同时为语料库方法在诗歌方面的应用研究提供新的语言特征指标。本文首先对文本从词汇、句子、语篇层面进行了通用文本特征研究分析。在词汇层面，统计和分析了类符数、形符数、标准化类符/形符比、词汇密度、词长分布。在句子层面，统计和分析了以语料库整体为对象的平均句长。在语篇层面，统计和分析了连接词的使用情况；然后从诗歌结构和诗歌韵律层面专门针对诗歌文本特征进行了研究分析。在诗歌结构层面，统计和分析诗行句长的相关系数、每首诗歌诗行数目的协方差、每首诗歌平均句长的协方差。在诗歌韵律层面，统计和分析了诗韵类型占比情况。通过研究，我们发现过去定性式《红楼梦》俄译本研究结论并不完全适用于俄译本诗歌部分。总体而言，在通用文本特征方面，1958年俄译本语言更为精简、凝练，用词量较小，但是词语的运用更为多样化，倾向于使用短词和短句。1995年俄译本具有更强的可读性和流畅度，词汇量较大，不过词语的运用不如1958年俄译本多变，倾向于使用较长的词语和句子。此外，1995年俄译本有较多的注释，更注重向读者传递信息的完整性。在诗歌特征方面，1958年俄译本诗行的句长、每首诗歌的诗行数目和每首诗歌的平均句长都更接近原文，在结构上也更为接近原文；在诗歌韵律方面，1958年俄译本末重韵占比和1995年俄译本相比较低，韵律相对柔缓，而1995年俄译本则在韵律上相对有力。﹀
外文摘要：	︿ The novel A Dream of Red Mansions is of great value for literary and sociological research. Currently, however, very few studies of translation style involve the two Russian translations of A Dream of Red Mansions, and the research approaches are restricted as well: mostly qualitative methods supported by examples rather than quantitative methods. Corpus-based translation style research has been developing rapidly in recent years, and quantitative research on translation style can not only help verify existing qualitative conclusions but also reveal and study topics that have not yet been discovered or studied. This thesis is dedicated to the quantitative comparison and analysis of the poetry translation styles of the 1958 and 1995 Russian translations of A Dream of Red Mansions. It aims to provide a new research perspective for the study of the Russian translations of A Dream of Red Mansions and new indicators for corpus-based poetry studies. Firstly, we analyze the generic text features at three levels: word, sentence and discourse. At the word level, we conduct a statistical analysis of types, tokens, standardised type/token ratio, lexical density and word length distribution. At the sentence level, we analyze the mean sentence length over the whole poetry corpus of each translation. At the discourse level, a statistical analysis compares the usage of conjunctions in the two translations. Next, we study poetry-specific features from two aspects: structure and rhythm. For poetic structure, we compute and analyze the correlation coefficient of the length of poetic lines, the covariance of the number of poetic lines in each poem, and the covariance of the mean sentence length of each poem. For poetic rhythm, we calculate the proportion of each rhyme type. 
In this research, we conclude that the overall qualitative conclusions of past research on the Russian translations of A Dream of Red Mansions are not fully applicable to the poetry in the translations. Overall, the analysis of generic text features shows that the language of the 1958 translation is more concise and condensed and its vocabulary is smaller, while its use of words is more diversified; its translator inclines to use shorter words and sentences. The 1995 translation has better readability and fluency and a larger vocabulary, but its words turn out to be less diverse, and its translator tends to use longer words and sentences. In addition, there are more explanatory notes in the 1995 edition, which indicates that its translator focuses more on the completeness of the information transmitted to readers. In terms of poetry characteristics, the three indicators of the 1958 translation (the length of poetic lines, the number of poetic lines in each poem and the mean sentence length of each poem) are all closer to those of the original, that is to say, the structure of the 1958 edition is more similar to that of the original. In addition, the proportion of end rhyme is lower in the 1958 edition, reflecting that its language is gentler, while that of the 1995 edition is relatively forceful. ﹀
分类号：	H087/H059
论文总页数：	99
参考文献总数：	78
馆藏号：	017/M2012(687)
公开日期：	2012-12-08
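
The word-level indicators named in the abstract above (tokens, types, standardised type/token ratio, word length) can be illustrated with a minimal Python sketch. This is not the thesis's own tooling: the tokenizer, the function names and the chunk size are assumptions chosen for illustration, and real Russian text would normally be lemmatized first.

```python
import re


def tokenize(text):
    """Lowercase the text and return runs of letters (any script) as tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def type_token_stats(tokens):
    """Return (token count, type count, raw type/token ratio)."""
    n_tokens = len(tokens)
    n_types = len(set(tokens))
    ttr = n_types / n_tokens if n_tokens else 0.0
    return n_tokens, n_types, ttr


def standardised_ttr(tokens, chunk_size=1000):
    """Average the TTR over consecutive full chunks of `chunk_size` tokens.

    The chunk size is an assumed parameter; texts shorter than one chunk
    fall back to the raw TTR.
    """
    starts = range(0, len(tokens) - chunk_size + 1, chunk_size)
    chunks = [tokens[i:i + chunk_size] for i in starts]
    if not chunks:
        return type_token_stats(tokens)[2]
    return sum(len(set(c)) / chunk_size for c in chunks) / len(chunks)


def mean_word_length(tokens):
    """Mean number of characters per token."""
    return sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0
```

Averaging over fixed-size chunks is what makes the standardised ratio comparable between translations of different lengths, since the raw ratio tends to fall as a text grows.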

用户文档的可用性研究—以智能手机用户手册为例.凌昱

链接

题名：	用户文档的可用性研究—以智能手机用户手册为例
姓名：	凌昱
学号：	10917280
论文语种：	chi
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2012-12-08
关键词：	技术写作；用户文档；可用性
外文关键词：	Technical writing User's Manual Usability Manual
论文摘要：	︿用户文档是技术文档的重要组成部分，在产品运营中扮演着越来越重要的角色。在人们的日常工作、学习和生活中，不可避免地要用到用户文档，甚至有时用户文档是人们学习一种产品的唯一途径，因此，对用户文档来说可用性至关重要，可用性高的用户文档不仅可以降低用户的阅读难度，增加用户的满意度，同时也可以降低企业的维护成本。但是与欧美、日本等IT发达地区相比，国内用户文档的可用性相对较低。以我们日常生活中常见的家电说明书为例，有调查显示，有50%以上不使用说明书的消费者认为说明书内容太过繁琐，少数消费者甚至看不懂说明书。由此可知，开发出可用性高的用户文档就显得非常有必要，而如何评定用户文档的可用性也成了其中非常重要的一项工作。为了更加有效、准确地评价用户文档的可用性，本文首先对国内主流的八款智能手机的用户文档进行了用户测试；然后在借鉴国内外相关研究的基础上，本文较为完整地总结了影响用户文档的因素，并结合专家深度访谈和层次分析法，建立了适用于用户文档可用性的评价指标集合；接着，本文在该评价指标集合的基础上，设计了以iPhone和小米手机的用户文档作为实证对象的用户文档可用性评价实验，总体包括实验设计、数据统计分析和评价结果三个部分；最后本文对该评价指标中的内容具体性、完整性、行距和行长度对用户文档的可用性影响进行了验证实验。本论文的创新在于：较为完整地总结了影响用户文档可用性的因素，并将其总结归纳成“易查找性”、“易理解性”、“易用性”和“实用性”四个维度，然后将每个维度进行一级指标的划分，细化出二级指标，建立用户文档可用性的评价指标集合，最后运用专家打分法和层次分析法计算了各项指标的权重；选取了 iPhone 和小米手机用户文档作为用户文档评价指标集合的实证对象，并对两个文档进行了细致的分析；运用 RST 关系结构统计文档中修辞结构的比例，有效地验证了文档内容的具体性、完整性确实对用户文档的可用性存在影响。﹀
外文摘要：	︿ As an indispensable part of technical documents, user manuals play an increasingly important role in product development and marketing, and their usability is especially important. User manuals with high usability can not only make reading easier for users but also increase user satisfaction and, at the same time, reduce the company's maintenance costs. However, compared with regions such as Europe and America, where the IT industry is well developed, the usability of domestic user manuals is rather low. Taking the instructions for household electrical appliances as an example, one survey shows that more than 50% of the consumers who do not use instructions consider them too tedious, and a few cannot even understand them. Thus, it is necessary to develop user manuals with high usability, and how to evaluate the usability of manuals becomes particularly important. ﹀
分类号：	TN929.5/TP311.5
论文总页数：	93
参考文献总数：	0
馆藏号：	017/M2012(768)
公开日期：	2015-12-08

E-learning产品本地化翻译及可读性研究.王培方

链接

题名：	E-learning产品本地化翻译及可读性研究
姓名：	王培方
学号：	10817362
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2012-12-08
关键词：	E-learning产品；可读性分析；翻译策略
论文摘要：	︿随着全球经济的扩张，电子学习(E-learning) 已经成为众多跨国公司内部培训的首选方式，而E-learning产品本地化翻译质量的好坏，则会直接影响到E-learning产品的使用效果。然而，目前对E-learning产品的本地化翻译并没有形成统一的翻译方法。E-learning产品本身包含文件比较复杂，对于E-learning产品的翻译要处理多方面的文本风格或格式，其中主要包括E-learning产品的文本承载文件、多媒体脚本文件以及UI文件三大类。由于E-learning产品的这种特殊性，应当将其从软件翻译的类别中分离出来单独研究其翻译策略。本文在第二章探讨E-learning本地化翻译与软件翻译的不同，同时也会分析与E-learning翻译有相似性的网站翻译、教科书翻译以及字幕文本翻译目前的研究现状。在对前人研究结果进行分析的同时，本文会指出E-learning产品翻译的特别之处，并在第三章本文着重探讨E-learning产品的翻译方法，从E-learning产品的文本文字特点、组成结构、用户需求等方面入手对E-learning的翻译方法和策略进行总结并举例说明。本文会在第四章采用工程实践的方法，采用第三章的翻译策略，用一个完整的E-learning本地化翻译流程对选定的E-learning产品进行翻译，并介绍一种可以用来检测E-learning译后文本翻译质量的方法，即用可读性分析及相关方法证明翻译之后E-learning产品能够保持足够的可读性。在第五章，对完成的E-learning本地化译后文本进行用户调查。通过用户问卷调查，证明第三章E-learning翻译方法在保证E-learning产品译后质量中起到的作用，并与第四章的可读性分析进行印证。最后，本文将论述文章研究的不足之处并指出几个未来可能的研究方向。虽然可读性在衡量译文是有诸多不足之处，但与用户调查相结合时，能够从一定程度上反映译文的质量。对于本地化行业内的E-learning产品翻译来讲，本文具有很强的实用性。本文的贡献主要在于两个方面：第一，采用可读性分析的方法对E-learning产品的翻译项目进行评价；第二，提出了系统的E-learning产品本地化翻译策略。﹀
分类号：	G434/H059
论文总页数：	78
参考文献总数：	0
馆藏号：	017/M2012(761)
公开日期：	2012-12-08

演示文稿翻译策略研究.王薇

链接

题名：	演示文稿翻译策略研究
姓名：	王薇
学号：	10917446
论文语种：	chi
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2012-12-08
关键词：	演示文稿；翻译策略；多模态语篇；语篇连贯
外文关键词：	PowerPoint Presentation; Multimodal Discourse; Translation Strategy; Text Coherence
论文摘要：	︿演示文稿在现代社会中的应用越来越多，广泛应用于报告、教学、培训、宣传、推介等目的。在跨国公司的产品全球化推广中，他们往往需要在产品宣传、推介以及产品培训中使用内容一致的演示文稿，以确保准确无误地传达信息和资讯。很多演示文稿是英语国家技术文档工程师撰写的，在其他国家使用这些英文演示文稿时，经常需要将它们翻译为本地语言，以方便使用。然而演示文稿与一般文本有着截然不同的特点。从行文上看，语句简练，常常使用省略句；大量使用缩写词；除文字外，还包含多种非文字符号元素，例如图形标志、数字图表、标注、色彩、图框、列表；每张幻灯片之间相互独立，整体篇章感不强。这些特殊之处导致了演示文稿的翻译具有不同于一般文本翻译的特点。演示文稿语篇具有哪些特殊性？应该如何选择演示文稿的翻译策略？这是本文拟研究的问题。本文首先采用语料分析法，在信息技术和金融两个领域选择了15篇专业演示文稿（涵盖403 张幻灯片内容）作为研究语料，通过人工分析的方法探讨了演示文稿中非文字符号模态的丰富性，确定了演示文稿拥有多模态性的独特文体特点，并指出这种多模态性和幻灯片之间的相互独立虽然在一定程度上影响了译者连贯地理解演示文稿的语篇意义，但演示文稿有它特有的语篇连贯特征。然后笔者在语篇分析基础上进行翻译策略研究。首先，笔者从自己翻译或审校过的演示文稿语料中选择具体示例进行译例分析，结合模态自身元素之间的互动关系以及模态之间的互动关系探讨各种模态的微观翻译策略。然后提出演示文稿的翻译应坚持语篇功能对等、语篇结构对等、衔接特征对等和最佳语境关联的宏观翻译策略。在具体语句和不同模态的微观翻译层次上，应该具体情况具体分析，灵活地运用全译和多种变译方式的微观翻译策略。﹀
分类号：	H059
论文总页数：	81
参考文献总数：	51
馆藏号：	017/M2012(772)
公开日期：	2015-12-08

2012-06-04

一种改进的动态网络最短路径算法.季海坤

链接

题名：	一种改进的动态网络最短路径算法
姓名：	季海坤
学号：	10917204
论文语种：	chi
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2012-06-04
关键词：	时间依赖；最短路径；动态网络；算法
论文摘要：	︿近年来，网络结点的最短路径算法被科研工作者深入而广泛地研究着，通过一系列的理论证明和实验数据，人们提出了许多经典算法，比如迪杰斯特拉算法，贝尔曼福特算法等。在诸多的经典算法模型中，每条路径的权重都是可预测和计算的，即其只适用于静态单元最短路径问题。然而，在很多实际工程的应用场景中，如GIS(Geographic Information System)应用，路径的权重却是随着时间的变化而变化的，即其具备时间依赖性，因此，只有很小部分的经典模型可以用于时间依赖网络的最短路径计算。针对这种情况，研究者们提出了一些适合时间依赖网络最短路径计算的算法，如TQQ(graph growth with double queues)，DKA(Dijkstra's algorithm with approximate buckets)，DKD(Dijkstra's algorithm with double buckets)以及ND_BA(Network Dijkstra's Bolin algorithm)算法。本文对ND_BA算法进行了研究，其算法思路为，通过初始化用户给定的出发时间区间的子集，并且对该子集的上限值迭代增加，从而不断更新道路交通网络中每个结点的最快到达时间函数，既而找出在该时间间隔内，从出发点到终结点的精确最佳出发时间，即在该时刻出发，出发点到终结点具有精确最短路径，该算法解决了时间依赖最短路径计算问题，但却以较大的功耗(空间和时间)开销作为代价。为了减少这种代价，本文通过设定边界值，对网络进行切割，过滤掉不符合计算条件的结点，得到近似的精确解，改进了基础算法。通过在模拟平台以及新加坡真实道路交通网络的实验，对两个算法进行了运行时间，近似值以及错误率三个方面的数据分析和比较。实验结果表明，随着预测值的增加，即边界值的不断扩大，本文算法所得的最佳出发时间，越来越接近基础算法的精确最佳出发时间，但却大大优化了算法的整体计算时间以及功耗(空间和时间)开销。本文算法已经发表在2011年ICCS（International Conference on Computational Science），论文名称为《Algorithm for Time-dependent Shortest Safety Path on Transportation Networks》。﹀
分类号：	TP393.092
论文总页数：	43
参考文献总数：	0
馆藏号：	017/M2012(215)
公开日期：	2015-06-04
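
To make the notion of a time-dependent shortest path in the abstract above concrete, here is a minimal Python sketch. It is neither ND_BA nor the thesis's improved algorithm: it is plain Dijkstra on arrival times plus a brute-force scan over candidate departure times, and it assumes every edge is FIFO (leaving later never means arriving earlier). The graph layout and function names are hypothetical.

```python
import heapq


def earliest_arrival(graph, source, target, depart):
    """Earliest arrival time at `target` when leaving `source` at `depart`.

    graph: {node: [(neighbour, travel_time_fn), ...]}, where
    travel_time_fn(t) is the travel time when the edge is entered at time t.
    Assumes FIFO edges, which is what keeps Dijkstra's greedy choice valid.
    """
    best = {source: depart}
    heap = [(depart, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == target:
            return t
        if t > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, travel_time in graph.get(u, []):
            arrive = t + travel_time(t)
            if arrive < best.get(v, float("inf")):
                best[v] = arrive
                heapq.heappush(heap, (arrive, v))
    return float("inf")


def best_departure(graph, source, target, candidates):
    """Return (departure, trip duration) with the shortest duration
    among the given candidate departure times."""
    trips = ((d, earliest_arrival(graph, source, target, d) - d) for d in candidates)
    return min(trips, key=lambda pair: pair[1])
```

Scanning every candidate departure time reruns the full search each time, which is the kind of time and space cost the abstract describes; the boundary-based pruning proposed in the thesis is not reproduced here.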

2012-06-03

基于语料库的互联网科普作品欧化翻译研究.唐舒芳

链接

题名：	基于语料库的互联网科普作品欧化翻译研究
姓名：	唐舒芳
学号：	10917404
论文语种：	chi
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2012-06-03
关键词：	科普翻译；互联网；欧化
论文摘要：	︿本研究以翻译学为核心，充分利用语料库的优势，对当前科普翻译中存在的欧化现象进行实证研究：即选择互联网上的有代表性的语料，自建译文语料库和原生语料库，对目前常见的欧化现象进行提炼归类，然后在定量研究的基础上，对比自建译文语料库和参照语料库，利用AntConc软件进行语料检索和统计分析，总结译文欧化在科普翻译中出现的情况，分析不当欧化对科普译文质量造成的影响，结论用于指导今后的科普翻译实践论证。﹀
分类号：	H059/TP311.52
论文总页数：	61
参考文献总数：	0
馆藏号：	017/M2012(346)
公开日期：	2015-06-03

2012-06-02

基于模因论视角的流行语翻译.赵阳

链接

题名：	基于模因论视角的流行语翻译
姓名：	赵阳
学号：	10917586
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	钱多秀
导师2单位：	北京航空航天大学外国语学院
论文答辩日期：	2012-06-02
关键词：	流行语翻译；模因论；强势模因；选择标准；翻译模因库
论文摘要：	︿流行语具有一定的社会属性和时代性，是一种普遍存在的综合语言现象，能够生动客观地折射出社会动态和时代变迁，同时反映人们的价值观和文化观。自上世纪末以来，流行语开始慢慢受到中国语言学界的关注，许多媒体及学者相继开展对流行语的监测、统计及语言应用研究。伴随着政治经济全球化以及中国国际地位的不断提高，希望和需要了解中国的人越来越多，流行语翻译研究这一课题也变得日益重要。然而，此领域并未受到应有的重视，关于流行语翻译的学术研究现状总体而言相对滞后且缺乏理论支撑。并且，以往的研究多为个案研究，普遍仅仅通过语言层面的译文对比等方法来探讨翻译策略和技巧，相关的翻译理论研究则十分匮乏，因而需要进行更为深入系统的研究。模因论（memetics）是基于达尔文进化论的观点解释文化进化规律的一种新理论，根据模因论的观点，模因是进行文化传播或模仿的基本单位，模因通过复制的形式在人的大脑之间通过传染来实现传播。而模因特有的传染性使它与流行语的传播具有一种潜在的联系。目前，已有不少学者运用模因论进行流行语的传播研究，而对流行语进行翻译是流行语进行跨文化传播不可或缺的环节。进而，本文尝试将模因论引入流行语翻译研究，以期为今后的流行语翻译工作起到一定的引导借鉴作用。模因在复制与传播过程中共经历四个阶段，在每个阶段都有其相应的选择标准，基于这一理论，本文尝试建构了一个模因论视角下的流行语翻译理论框架。在此框架内，对于每一条流行语，将其视为一个模因并运用其复制与传播过程四个阶段的各个选择标准对其传播与复制能力进行分析。通过研究分析发现，与强势模因相结合的流行语更易于进行传播，具有更强的传染力，更容易满足模因复制与传播四个阶段的选择标准。在此研究结果基础之上，本文提出了流行语翻译过程中的目的语模因选择标准和针对不同情况译者通常选择的策略。同时，尝试性的建构了相应的流行语翻译过程模型及流行语翻译模因库。通过基于模因论视角对流行语翻译的研究发现：流行语翻译可以被视为在目标语文化中创作强势模因的过程；模因复制与传播的四个阶段和相应选择标准对流行语翻译具有很强的解释力；流行语的翻译过程是一个多层面的选择过程。最后得出研究结论，流行语翻译并不是将单纯将原语信息忠实的转换到目标语的复制行为。而是一种创造性的、多维互动的综合过程。﹀
分类号：	H085
论文总页数：	81
参考文献总数：	90
馆藏号：	017/M2012(458)
公开日期：	2012-06-02

英文网络游戏语体特征的多维度分析.苏霄

链接

题名：	英文网络游戏语体特征的多维度分析
姓名：	苏霄
学号：	10917385
论文语种：	chi
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	何卫
论文答辩日期：	2012-06-02
关键词：	网络游戏；语言特征；多维度分析；语料库
论文摘要：	︿随着21世纪电子信息产业的飞速发展，网络游戏崭露头角，一跃成为了全球炙手可热的新兴产业。在全球商业化大环境的推动下，我国的网络游戏行业发展呈逐年上升趋势。随着海外市场的不断开发，新的产业——网络游戏翻译产业正在逐渐发展壮大。作为一个新兴的产业，有关网络游戏领域的研究涉及甚少，译员在做翻译时往往只能依靠自身的翻译经验和网络上的资源，缺乏理论与风格的指导是一个明显的问题。因此，研究网络游戏文本中包含的一系列语言特征和相互关系，找寻该领域的语言表达特点，建立规范的理论体系和风格指导，是非常必要的。本文采用Biber创建的多维度统计分析模型，以语域变异研究的理论为基础，对网络游戏的语体特征进行研究分析。研究首先采集了包含网络游戏、学术文章、普通小说、新闻广播和面对面交谈这5个语体的语料库，旨在通过比较多个语体之间的差异，分析得出网络游戏的语言特点。随后，则使用标注系统对采集的语料进行语法特征的标注，提取并统计5种语体中“共现”的语言特征出现的频数，并在6个不同的维度上对5种不同的语体类型进行比较分析，全面并客观地揭示网络游戏语体与其他语体在语言特征上的差异，总结网络游戏的语体特征。研究表明，网络游戏文本具有高度互动性与参与性特征，呈现出能够不依赖于语境，客观而非抽象的风格，同时包含着丰富的感情色彩。另外，通过比较分析，研究发现与网络游戏语言特征最相似的为面对面交谈语体，期望借助它的翻译理论和风格对游戏翻译进行指导。﹀
外文摘要：	︿ With the rapid development of the electronic information industry in the 21st century, the online game industry has emerged and rapidly become one of the hottest industries. Propelled by worldwide commercialization, the online game sector in China is growing year by year. In addition to the domestic market, the overseas market also draws much interest from online game companies, and its continuous expansion plays a positive role in the growth of a new industry: online game translation. As this is an emerging industry, there is little research on the language use and style of online game translation. Lacking theoretical guidance, translators can often rely only on their own experience and on resources from the internet. Thus, it is necessary to study the linguistic features of online game texts and the relations among them, identify their style characteristics and build a systematic theory. In this thesis, I adopt the multi-dimensional analysis model created by Biber as the methodology. Based on the theory of register variation, the style characteristics of online game texts are studied. Firstly, texts from online games, academic prose, general fiction, news broadcasts and face-to-face conversation are collected to build a corpus, so that the style characteristics of online games can be derived through comparative analysis. Then, grammatical tags are added to the texts in the corpus, and the frequencies of the "co-occurring" linguistic features are counted. Thirdly, the functions of these "co-occurring" linguistic features are interpreted on six different dimensions. Finally, the linguistic features of the five registers are compared on the six dimensions, which fully and objectively reveals the style characteristics of online games. The results indicate that online game texts are highly interactive and participatory. They appear independent of the situation, objective and non-abstract, while carrying abundant emotion. Furthermore, the research finds that the face-to-face conversation register has the style characteristics most similar to those of online games. It is hoped that the study will help guide the translation of online game texts. ﹀
分类号：	G898/H059
论文总页数：	61
参考文献总数：	38
馆藏号：	017/M2012(330)
公开日期：	2015-06-02

餐饮推荐系统的设计与实现.孔令恺

链接

题名：	餐饮推荐系统的设计与实现
姓名：	孔令恺
学号：	10917220
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2012-06-02
关键词：	餐饮；推荐；关联；协同过滤
论文摘要：	︿近年来，伴随着电子商务网站的迅速发展和数据挖掘技术的巨大进步，推荐系统得以广泛应用。推荐系统的应用能够方便消费者在纷繁复杂的商品信息中快速定位自己所需要的商品，从而提高用户体验，提高顾客的粘着性，防止客户流失。本文所关注的是如何在餐饮系统中设计开发推荐系统，提高用户体验。针对在餐馆中存在大量非会员这一特点，本文使用关联算法对菜品之间的关联关系进行计算，将关联模型存入数据库中，当用户点菜的时候，可以实时调用。全部计算过程分为离线计算生成关系表、在线点菜调用关系表两个过程，将耗时较多的后台计算与对实时性要求较高的点菜调用分离开来，满足了点菜实时性的要求。针对餐饮系统的会员点餐特点，本文对传统协同过滤推荐算法做了一定的修正。通过对用户历史点菜记录的统计，以用户针对具体菜品的历史下单数量代替用户对菜品的评分。通过一定的方法计算得到用户间的相似性，排序后得到与当前用户相似度最高的邻居集合，根据他们的下单历史数据，得出在邻居集合中点菜数量的排序，并以此作为推荐依据。由于传统协同过滤算法不能体现用户的兴趣变化，而且不能体现餐饮季节性变化，本文引入了艾宾浩斯遗忘曲线和修正权重值来修正菜品的统计值。最终，笔者实现了推荐系统，并通过实验证明了本文提出的修正相对于传统算法具有一定的提高。﹀
分类号：	F713.36
论文总页数：	50
参考文献总数：	19
馆藏号：	017/M2012(229)
公开日期：	2012-06-02
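
The member-side recommendation idea in the abstract above (order counts standing in for ratings, a nearest-neighbour set, and a forgetting-curve correction) can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation: the decay form exp(-t/S), the `strength` value, cosine similarity and the choice to skip dishes the user has already ordered are all assumptions, and every name is hypothetical.

```python
import math
from collections import defaultdict


def decayed_counts(orders, strength=30.0):
    """Build {user: {dish: weight}} from (user, dish, days_ago) records.

    Each order contributes exp(-days_ago / strength), a simplified
    forgetting-curve weight, so recent orders count more than old ones.
    """
    profile = defaultdict(lambda: defaultdict(float))
    for user, dish, days_ago in orders:
        profile[user][dish] += math.exp(-days_ago / strength)
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse {dish: weight} vectors."""
    dot = sum(w * b[d] for d, w in a.items() if d in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(profile, user, k=2, top_n=3):
    """Rank dishes by their decayed order counts among the k most similar users."""
    me = profile[user]
    ranked = sorted(
        ((cosine(me, p), u) for u, p in profile.items() if u != user),
        reverse=True,
    )
    scores = defaultdict(float)
    for _, neighbour in ranked[:k]:
        for dish, weight in profile[neighbour].items():
            if dish not in me:  # assumption: only suggest dishes not yet ordered
                scores[dish] += weight
    best = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [dish for dish, _ in best[:top_n]]
```

Seasonal correction weights, which the abstract also mentions, would enter as an extra multiplier on each order's contribution; they are left out because the abstract does not specify their form.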

在翻译工具辅助下的翻译过程实证性研究.刘小雨

链接

题名：	在翻译工具辅助下的翻译过程实证性研究
姓名：	刘小雨
学号：	10917301
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2012-06-02
关键词：	有声思维；翻译策略；翻译单位；计算机辅助翻译；翻译记忆库
外文关键词：	Think-aloud protocols; translation strategy; translation unit; computer-aided translation; translation memory
论文摘要：	︿随着信息化趋势不断蔓延，翻译已经不局限于过去的人工翻译，翻译软件正在翻译项目中发挥越来越大的作用。译员在享受翻译工具带来的便捷时，却很少有人想过翻译工具到底对翻译过程产生何种影响。本文采用有声思维法，探究翻译工具对翻译过程产生的影响，主要从翻译策略和翻译单位两个方面分析。本次实验以12名翻译专业研三学生作为实验对象，完成3组不计时实验，每组有大约150字翻译任务。通过对收集到的TAPs语料进行切分与对比，创建出一套对原有编码体系的补充编码。在对所有TAPs语料编码分析后，提取包含在受试者翻译过程中的翻译策略成分和翻译单位，完成对翻译过程的定性分析。在此基础上，对所有编码后的TAPs语料进行定量统计，以获取数量、频率和分布的相关数据，进行定量分析。最后，将使用翻译工具和不使用翻译工具时的翻译过程进行对比，总结出在翻译工具辅助下翻译过程的认知特点。在翻译策略种类方面，本文在Lörscher（1991，1992，1996）翻译策略编码体系的基础上，提出了使用翻译工具时的8大类19种翻译策略指标，其中包括：接受句子以上原文语段、接受句子原文语段、接受句子以下原文语段、监控翻译记忆库中提示的原文、监控翻译记忆库中提示的译文、监控术语库中提示的术语、对比待译原文和TM例句中的原文、对比待译原文和TM例句中的译文、结合TM例句组织语言、通过记忆库例句检查译文、通过语境检查译文、通过比对原文检查译文、通过比对译文检查译文、对翻译记忆库中提示的语句进行评论、对术语库中提示的术语进行评论、尝试TM例句的句法重构、翻译记忆库搜索无效、术语库搜索无效、翻译记忆库部分搜索无效。本文同时提出了6种使用翻译工具时出现的翻译策略结构，并且将其图示，绘制成了翻译策略流程图。通过对比译员使用翻译工具和不使用翻译工具时翻译策略指标的差异，我们发现译员使用翻译策略指标频率有较大不同。比起不使用翻译工具时的情况，使用翻译工具的译员相对频繁使用监控策略、检查策略。在翻译单位方面，译员在使用CAT工具时倾向于使用相对更大的翻译单位。通过对统计数据的分析，我们发现受试者在翻译过程中还存在四个问题。第一，学生译员忽略对翻译上下文的关注；第二，学生译员对翻译记忆工具过于依赖；第三，学生译员使用TM的效率受到TM提供例句的数量以及质量的影响；第四，译员使用翻译记忆库的习惯不固定。本文基于有声思维实验数据，归纳与总结译员在使用CAT工具时的翻译过程，对已有理论进行扩展和细化，为将来的研究打下基础。﹀
外文摘要：	︿ As informatization spreads across the globe, translation is no longer confined to human translation, and translation software is playing an increasingly significant role in translation projects. While translators enjoy the convenience of computer-aided translation (CAT) tools, they have seldom thought about what effects CAT tools have on the translation process. This thesis explores the influence of CAT tools on the translation process in terms of translation strategy and translation unit, using the method of think-aloud protocols (TAPs). In this multiple-case study, 12 student translators were asked to think aloud while translating two texts from English, their first foreign language, into Chinese, their native tongue, with the goal of identifying indicators of cognitive processes and translation units during their performance of translation tasks with the use of CAT tools. The translation strategy indicators I put forward based on Lörscher’s coding system (1991, 1992, 1996) include: Reception of SL Text Segments - Above Sentence, Reception of SL Text Segments - Sentence, Reception of SL Text Segments - Below Sentence, Monitoring of SL Text Segments in TM, Monitoring of TL Text Segments in TM, Monitoring of Term in TB, Comparison between SL in text and SL in TM, Comparison between SL in text and TL in TM, Mental Organization of TL Segments combining TM Text Segments, Checking between TM and TL, Checking in context, Checking between TL and SL, Checking between examples in TM, Checking between TL and TL, Comment on a TM Text Segment, Comment on a Terminology, Rephrasing of TM Text Segments, A Solution in TM is still to be found, A Solution in TB is still to be found. Six translation strategy patterns are summarized and graphically presented in this thesis in order to display the translation process clearly. 
The comparison of the two experiments leads to a preliminary investigation of the similarities and differences between translating with and without CAT tools, such as (a) a higher frequency of CHECK and MONITOR strategies when translating with CAT tools, and (b) an increased length of translation units when CAT tools are used. Through the analysis of statistical data, we found four problems in the students’ translation process. Firstly, student translators ignored the context when translating with CAT tools. Secondly, student translators were too dependent on CAT tools. Thirdly, student translators’ efficiency was influenced by the number and quality of the examples provided by the translation memory. Fourthly, student translators’ habits of using the translation memory were not fixed. This thesis, based on experimental data, summarizes translators’ translation processes with the use of CAT tools. It extends and refines existing theory and lays the foundation for future research on this topic. ﹀
分类号：	H087/H059
论文总页数：	103
参考文献总数：	61
馆藏号：	017/M2012(280)
公开日期：	2012-06-02

语料库驱动的中英旅游英文官网语体差异分析.雷雯霆

链接

题名：	语料库驱动的中英旅游英文官网语体差异分析
姓名：	雷雯霆
学号：	10917229
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2012-06-02
关键词：	语料库；语体分析；旅游
论文摘要：	︿本研究中的语体指建立在同义性基础上，因语境的不同而引起的说话方式的不同。选取的语料来自中国与英国的旅游官网，中方的是：中国国家旅游局、北京旅游局、上海旅游局官方网站；英方的是：英国官方旅游网、约克郡旅游局、伦敦旅游局官方网站。本文先概述了研究所涉及的理论基础，然后阐述了研究所用的软件工具与方法，最后给出了详细的研究结果，包括图表、正则表达式与附录。本研究将语言学知识与正则表达式结合起来，着重分析了以下基本统计量：平均段长、平均词长、词汇密度、平均句长、句类、词类、现在时与过去时，以及语体的正式性、客观性、紧凑性；最后使用多维度多特征方法，对有关语言特征进行了统计学上的因子分析，得到了两个语言变化维度，其中之一命名为说服/说明因子。得到的其它结论有：英方平均段长较短，词汇更丰富，平均句长更长，大量使用祈使句、感叹句，以现在时为主；中方平均段长更长，标点符号使用稀疏，以大量的陈述句为主、使用祈使句感叹句少，更多使用名词与过去时。另外，中方在语体上正式性、客观性、紧凑性程度高；英方的语体则随意活泼，表现出很强的非正式性。从语境理论与文本信息理论来阐释可知，英方将网页访问者预设为关系熟络的朋友，创设的文本氛围富于情感变化与口语特色，且不时呼唤读者亲身去目的地体验探索，展现了英方较强的营销意图；中方语言则十分缺少与读者的互动，重点放在客观详尽描述旅游目的地的信息上，且用语正式、去人情化，因而给读者造成了沉闷、疏离的印象。﹀
外文摘要：	︿ Register in this study (Differences in Register between the Official English Tourism Websites of China and the UK: A Corpus-driven Analysis) refers to the different ways of speaking that rest on synonymy and are caused by differences in context. The official tourism websites studied are those of China, Beijing, Shanghai, England, Yorkshire and London. This thesis first outlines the theoretical basis involved and then introduces the software, tools and methods used in the study. Finally, detailed findings, including charts, tables, regular expressions and appendices, are presented. Differences in register are analysed from the following aspects. The first is basic stylistic statistics, including average paragraph length, average word length, lexical density, average sentence length, sentence category according to function, parts of speech, and present and past tense. Then the formality, impersonality and compactness of the language are examined according to the relevant theories and by means of regular expressions. Finally, by applying the multidimension/multifeature method, two factors are obtained from a factor analysis of nine linguistic features run in the statistical software SPSS. One of the two factors is interpreted and named the persuasive factor. In addition, the websites from England have shorter average paragraphs, richer vocabulary and longer average sentences, and make more use of imperative and exclamatory sentences and the present tense. By contrast, the websites from China have longer average paragraphs and more punctuation, and make more use of declarative sentences, nouns and the past tense. Moreover, the language of the Chinese websites is more formal, impersonal and compact; the language of the English websites is more casual and lively, showing a high degree of informality. 
Viewed from Context Theory and Newmark’s Text-Typology, the websites from England can be said to treat their audience as familiar friends, which results in webpages with richer emotional variation and more oral characteristics; furthermore, the authors of the websites from England call on readers from time to time to visit the destinations and experience them in person, which shows their strong marketing purpose. By comparison, the websites from China focus on elaborate description of destinations and pay little attention to direct interaction with readers, leaving readers with a dull and alienated impression. The findings are constructive in helping China to build more effective international tourism websites. Further work lies in expanding the corpus to conduct a more detailed and extensive study. ﹀
分类号：	H059/TP391
论文总页数：	75
参考文献总数：	0
馆藏号：	017/M2012(236)
公开日期：	2012-06-02
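
The abstract above combines regular expressions with basic stylistic statistics. A minimal Python sketch of that idea follows; the regular expressions and the function name are assumptions made for illustration rather than the patterns used in the thesis, and measures such as lexical density, which need part-of-speech tagging, are left out.

```python
import re

# Hypothetical patterns: an English word (optionally with an apostrophe part)
# and the whitespace that follows sentence-final punctuation.
WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def basic_stats(text):
    """Return average paragraph, sentence and word length for English text,
    plus the share of sentences that end with an exclamation mark."""
    words = WORD.findall(text)
    if not words:
        raise ValueError("text contains no words")
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    sentences = [s for p in paragraphs for s in SENTENCE_BREAK.split(p) if s]
    return {
        "avg_paragraph_len": len(words) / len(paragraphs),  # words per paragraph
        "avg_sentence_len": len(words) / len(sentences),    # words per sentence
        "avg_word_len": sum(len(w) for w in words) / len(words),
        "exclamatory_share": sum(s.endswith("!") for s in sentences) / len(sentences),
    }
```

Running the same function over each website's text gives directly comparable figures, which is the kind of side-by-side comparison the abstract reports for the Chinese and British sites.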

基于LDA模型的博客主题提取.王珍

链接

题名：	基于LDA模型的博客主题提取
姓名：	王珍
学号：	10917460
论文语种：	chi
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2012-06-02
关键词：	LDA模型；关键词提取；无指导；新词发现；关键词评判标准
外文关键词：	LDA model; Keyword extraction; Unsupervised; New words detection; Keywords quality criteria
论文摘要：	︿博客(Web Blog)，是一种通常由个人管理、不定期张贴更新的文章的网站[1]。随着计算机技术、通信技术以及互联网的迅猛发展，越来越多的人通过在网上撰写博客来表达他们对事物的情感和观点，博客上的信息呈现指数增长。在这种由信息爆炸所带来的挑战下，人们迫切呼唤一些自动化工具来帮助他们从浩瀚的信息汪洋中准确、便捷地找出所需的关键信息。主题提取正是在这种背景下产生。本文的主要工作和创新点在于：一方面，本文分析了基于统计的流派和基于规则的流派在主题提取任务上各自方法的优劣，继承前人优点的同时，还针对几种主流方法的缺点，进行了改进。作者将LDA这种相对高级的模型应用于主题提取，工作之一是对关键词进行加权；工作之二，同时也是LDA模型求解的一个难点，在于确定主题数目，本研究使用的是基于密度的自适应主题数法；工作之三是筛选候选关键词词组的数目。另一方面，由于互联网环境瞬息万变，博客文章中常常出现一些新词，而主题词的提取会很大程度的受到分词结果的影响，如果分词程序不能将这些词语或短语正确的识别出来，那抽取出正确的关键词也会变得十分艰难。因此本研究将著名的ICTCLAS程序做了优化，使其能发现新词和短语，并切分出语义信息丰富的粗粒度词语。此外，长期以来，尽管研究关键词的学者许许多多，然而对于关键词的定义以及关键词的评判标准，一直都处于含糊不清或者看法各异的状态。关键词的判断和标引确实非常主观、因人而异，但这种主观判断中也有相对客观的衡量标准。作者结合前人观点，进一步提出了清晰明确的关键词判断标准，使得什么是合适的、正确的关键词有了可依赖的特征。为了使得到的关键词组能更直观的表达出文章的主题，作者还用了数据可视化的方法，将提取出的关键词组以不同颜色和大小展现出来。﹀
外文摘要：	︿ A blog is a website, commonly managed by an individual, on which posts are published from time to time [1]. With the development of information technology, more and more people express their feelings and opinions by writing blogs on the Internet. The exponential growth of information on blogs calls for automated tools that help people locate the key information they need accurately and conveniently. Topic extraction arose against this background. The main work and innovations of this thesis are as follows. On the one hand, it discusses the pros and cons of statistics-based and rule-based approaches to topic extraction and, while inheriting their advantages, improves on several shortcomings of the mainstream methods. The author applies the relatively advanced LDA model to topic extraction: the first task is to weight the keywords; the second, which is also a difficulty in using the LDA model, is to determine the number of topics, for which this thesis adaptively selects the best LDA model based on density; the third is to select the best number of candidate keywords or key phrases. On the other hand, because Internet language evolves rapidly, new words appear one after another. Since topic extraction is greatly influenced by the segmentation results, poor segmentation yields poor keywords. This thesis therefore optimizes the well-known ICTCLAS program so that it can automatically discover new words and phrases and output coarse-grained words with rich semantic information. In addition, although many scholars have studied keywords, the definition of a keyword and the criteria for judging keyword quality have long remained vague or disputed. The judgment of keywords is indeed very subjective and varies from person to person, but it can still be measured in a relatively objective way. Building on previous work in this area, the author proposes clear keyword quality criteria, giving dependable characteristics for deciding what an appropriate keyword is. 
In order to make the topic key phrases more intuitive, the author also uses data visualization to present the extracted key phrases in different colors and sizes. ﹀
Classification number:	TP393.07
Total pages:	55
Total references:	30
Call number:	017/M2012(384)
Release date:	2015-06-02
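
The abstract above mentions weighting keywords with LDA but does not say how. As a minimal sketch of one common way to do it, the snippet below scores each word of a document as the sum over topics of P(topic | document) × P(word | topic). It assumes the two distributions have already been estimated by some LDA implementation; the function names and the toy numbers are hypothetical and are not taken from the thesis.

```python
def keyword_scores(doc_topic, topic_word):
    """Score each word for one document as sum_k P(k | d) * P(w | k)."""
    scores = {}
    for topic, p_topic in doc_topic.items():
        for word, p_word in topic_word[topic].items():
            scores[word] = scores.get(word, 0.0) + p_topic * p_word
    return scores


def top_keywords(doc_topic, topic_word, n=3):
    """Return the n highest-scoring words (ties broken alphabetically)."""
    scores = keyword_scores(doc_topic, topic_word)
    return sorted(scores, key=lambda w: (-scores[w], w))[:n]


# Toy distributions (hypothetical values, for illustration only).
topic_word = {
    "travel": {"hotel": 0.5, "flight": 0.4, "price": 0.1},
    "tech": {"phone": 0.6, "price": 0.3, "battery": 0.1},
}
doc_topic = {"travel": 0.8, "tech": 0.2}

print(top_keywords(doc_topic, topic_word, n=2))
```

A word such as "price" that appears in several topics accumulates weight from each of them, which is the main reason to sum over topics rather than look only at the dominant topic.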

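基于英语单语语料库的英汉翻译实务应用研究.李南哲

Link

Title:	基于英语单语语料库的英汉翻译实务应用研究 (Applying English Monolingual Corpora to English-Chinese Translation Practice)
Name:	李南哲
Student ID:	10917253
Thesis language:	chi
Major:	Software Engineering
Release time:	After 1 year
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	俞敬松
Advisor 1 affiliation:	School of Software and Microelectronics
Advisor 2 name:	何卫
Advisor 2 affiliation:	School of Foreign Languages
Defense date:	2012-06-02
Keywords:	corpus; translation practice; BYU; Sketch Engine; corpus improvement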
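Abstract:	︿Since the 1990s, translation scholars such as Mona Baker and Sara Laviosa have introduced corpora into translation studies, and corpus-based "quantitative" and "descriptive" methods have developed within the field. After nearly twenty years, corpus technology keeps improving, corpus-based translation research keeps advancing in method and depth, and it has produced many results. Research on using monolingual corpora to support practical translation, however, remains scarce. Earlier corpus-based research on translation and translation teaching shows that corpora are one means of supporting translation practice: they help make translations more accurate and more idiomatic. From the perspective of the cognitive process of translation, corpora and corpus query systems can effectively compensate for gaps in the translator's mental lexicon and resolve difficulties that arise during translation, making them a powerful reference tool. Accordingly, using corpus query systems such as Sketch Engine and BYU and existing large-scale English monolingual corpora, this thesis discusses, through concrete translation examples, how monolingual corpora are used in translation practice, and proposes ideas for improving corpora based on the specific needs of that practice. The thesis has six chapters. The introduction reviews earlier research and sets out the focus, difficulties, and contributions of the thesis. Chapter 2 introduces corpora and argues, from a practical standpoint, that applying monolingual corpora to translation practice is feasible. Chapter 3 "poses the problem": it examines the translation process and the difficulties encountered in it. Chapter 4 aims to "solve the problem": it first introduces the Sketch Engine and BYU corpus query systems and then uses concrete translation examples to show how monolingual corpora are applied in translation practice. Chapter 5 proposes ideas for "improving the solution", suggesting improvements to corpus query functions based on translators' specific needs. Chapter 6 summarizes the thesis and notes its limitations and directions for future research.﹀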
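Classification number:	H087
Total pages:	76
Total references:	63
Call number:	017/M2012(248)
Release date:	2013-06-02

中医对外援助工作的翻译研究及术语库建设.范慧阳

Link

Title:	中医对外援助工作的翻译研究及术语库建设 (Translation Research and Termbase Construction for Foreign Aid Work in Traditional Chinese Medicine)
Name:	范慧阳
Student ID:	10817116
Thesis language:	chi
Major:	Software Engineering
Release time:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	俞敬松
Advisor 1 affiliation:	School of Software and Microelectronics
Defense date:	2012-06-02
Keywords:	foreign aid in traditional Chinese medicine; computer-aided translation; corpus; termbase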
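Abstract:	︿Foreign aid projects are an indispensable part of China's diplomacy and have always been highly valued by the Chinese government; they reflect the fine traditions of an ancient civilization and its humanitarian spirit. Over the course of China's foreign aid, and especially in the last decade or so, its scale has grown and its content has become richer. More and more developed and developing countries have taken a strong interest in the breadth and depth of traditional Chinese medicine (TCM) and its remarkable efficacy. TCM foreign aid has contributed to passing on and spreading TCM culture and to building China's national image, so carrying it out smoothly matters a great deal. However, TCM foreign aid faces both the difficulties of managing foreign aid and the difficulties of translating TCM literature. Recipients come from different regions, with different customs, habits of thought, laws, and regulations, and different expectations of Chinese aid, all of which affect how the work is completed. The specialized vocabulary of TCM literature and the natural language barrier between East and West make problems such as unifying Chinese and Western medical terms and translating classical Chinese vocabulary especially prominent. Language problems also arise in management services and knowledge management. Management documents are highly similar in format and content, and the corresponding translation work is repetitive and frequent, so doing it purely by hand is inefficient. Foreign aid work currently lacks a knowledge management system; materials are inconvenient to accumulate and query and are rarely reused, which directly hinders the reuse and transmission of knowledge. Problems like these seriously impede TCM foreign aid. Based on problems the author met in actual work, and on the characteristics and typical scenarios of TCM foreign aid, this thesis proposes supporting the work with corpora and termbases and designs the workflow and steps for building the corpus. Corpora and termbases are common tools in computer-aided translation and meet its needs; the maturity of the technology and its value to translators are beyond doubt. However, TCM foreign aid involves many kinds of text, such as medical records, timetables, courseware, documents, regulations, and living guides, and these resources are used in frequently overlapping ways, so no existing corpus, termbase, or other solution specifically addresses the complex scenarios of TCM foreign aid. This thesis proposes organizing language assets in a centralized corpus that manages all these resources together; while maintaining the corpus data, it can label the domain or specialty to which each item belongs. The centralized corpus has a data import function and a customizable domain export function, which exports the data for a chosen domain or specialty in file formats that corpus software such as AntConc, ParaConc, and Trados can read. For convenience, the centralized corpus also provides a query interface backed directly by the database, with no export needed. The thesis designs two ways of building a termbase: editing it by hand and exporting it from the corpus. While studying corpus acquisition and term extraction for the termbase, it also experiments with term extension grounded in TCM foreign aid work, offering hints and reference knowledge such as common mistranslations, habitual collocations, and term reviewers. On this basis, the thesis validates the construction and use of the corpus and termbase, and the use of Internet resources, in real translation practice. The results show that the proposed construction methods are feasible. With the corpus and termbase built here, common problems met in actual work can be handled effectively: the same word carrying different meanings, translation consistency, highly repetitive source material, the need to match local language style, and the difficulty of translating classical Chinese. The author's practical experience shows that using the corpus and termbase designed here to support TCM foreign aid dealt effectively with real problems and reduced the difficulty of the work, achieving the goal of the thesis. Finally, starting from the idea of term extension, from holding the knowledge produced by the translation work itself to linking terms to laws, regulations, and other documents, the thesis designs and builds with MediaWiki a prototype translation knowledge management system for TCM foreign aid. The results of using it show that the system not only assists the project team's translation services but also provides language service support related to TCM foreign aid for medical staff engaged in TCM foreign aid practice and teaching, for medical staff receiving aid, and for foreign aid management. This research extends and continues the author's internship work. The author's team strongly affirmed the convenience that computer-aided translation brings to TCM foreign aid. Continuing to apply and develop this research in actual work has broad prospects and significance.﹀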
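Classification number:	TP311.52/H087
Total pages:	77
Total references:	0
Call number:	017/M2012(081)
Release date:	2012-06-02

基于主题模型的全宋词语料库构建以及计算机辅助宋词创作研究.黄子轩

Link

Title:	基于主题模型的全宋词语料库构建以及计算机辅助宋词创作研究 (Building a Topic-Model-Based Corpus of the Complete Song Ci and Computer-Aided Composition of Song Ci)
Name:	黄子轩
Student ID:	10917203
Thesis language:	chi
Major:	Software Engineering
Release time:	After 1 year
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	俞敬松
Advisor 1 affiliation:	School of Software and Microelectronics
Defense date:	2012-06-02
Keywords:	computer-aided poetry composition; reference word recommendation; LDA model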
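Abstract:	︿Computational work on classical Chinese poetry began in the mid-1990s. Many scholars have studied corpus construction, lexical semantic analysis, poetic style analysis, couplet matching, and automatic poetry generation, with some success. Broadly, this work follows two directions. One uses computers to analyze poetry: its diction, changes in vocabulary, stylistic features, and so on. The other uses computers to compose poetry, for example generating haiku or generating poems automatically. This thesis focuses on using the computer's fast computation and retrieval, together with results from related areas such as lexical semantic analysis and automatic poetry generation, to help poetry enthusiasts write better poems. It mainly discusses a new research direction in computer-aided poetry composition: reference word recommendation. While a person composes a poem, the computer uses its fast retrieval and existing linguistic knowledge to offer as many potentially useful reference words as it can. Throughout, the computer's role is to use existing knowledge to narrow the range of reference words as far as possible so that the recommendations are useful, while the composing itself is done by the person. The computer thus acts as an intelligent retrieval system that helps the writer bring together different kinds of knowledge and offers possibly helpful suggestions without affecting the pleasures of composition itself, such as expressing ideas and feelings; this is the tool poetry enthusiasts most want when they write. Given the particular nature of reference word recommendation, the thesis builds a word-segmented and phonologically annotated corpus of the Complete Song Ci (全宋词), drawing on the syntactic features of Song ci. For classifying the emotional style of words, it adopts the LDA topic model and recasts style acquisition as topic acquisition, obtaining style classification data for the vocabulary. For acquiring word collocations, it uses the syntactic features of Song ci to split lines appropriately, which effectively reduces the number of invalid collocations. The final recommendation step considers metrical rules, collocation, and topical style together, narrowing the candidate reference words to a number that is easy to choose from while keeping as many potentially useful words as possible and avoiding over-filtering.﹀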
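Classification number:	TP391.72
Total pages:	56
Total references:	40
Call number:	017/M2012(214)
Release date:	2013-06-02

一种对外汉语教学辅助软件的开发.韩捷

Link

Title:	一种对外汉语教学辅助软件的开发 (Development of Assistive Software for Teaching Chinese as a Foreign Language)
Name:	韩捷
Student ID:	10817140
Thesis language:	chi
Major:	Software Engineering
Release time:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	俞敬松
Advisor 1 affiliation:	School of Software and Microelectronics
Advisor 2 name:	李素建
Advisor 2 affiliation:	School of Electronics Engineering and Computer Science
Defense date:	2012-06-02
Keywords:	teaching Chinese as a foreign language; text classification; automatic scoring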
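Abstract:	︿As China's national strength grows, enthusiasm for the Chinese language and Chinese culture keeps rising. The Office of Chinese Language Council International (Hanban) attaches great importance to teaching Chinese as a foreign language (TCFL) and actively promotes related work. TCFL is flourishing, but problems remain. As an exploration of information technology in TCFL, this thesis describes the development of a teaching-assistance software package. The software addresses two problems: teachers' workload in teaching writing and the shortage of suitable materials in teaching reading. The author develops a regression-based automatic essay scoring system to reduce teachers' workload and a corpus-based system for customizing teaching materials to address the materials problem in reading instruction, and also provides an example-sentence retrieval system for the corpus so that teachers can collect suitable examples. First, the thesis studies teachers' workload in TCFL writing instruction and designs and implements a preliminary automatic essay scoring system to help teachers work more efficiently. Because feature selection strongly affects regression-based scores, and because essays in TCFL are structurally simple and short, the author analyzes features such as collocation, sentence length, and sentence complexity, and proposes a concrete algorithm for extracting collocation features. Using word, sentence, document, and collocation features, the author compares four regression methods: linear regression, support vector regression, simple linear regression, and K* regression. Analysis of the experimental results shows that linear regression performs best overall. Second, the thesis analyzes the lack of targeted materials for TCFL reading and proposes a solution through the design and implementation of the material customization and example-sentence retrieval systems. According to theories of language teaching, the key factors for customizing materials are interest, purpose, readability, and topic selection. Building on corpus technology, the thesis develops a material customization system. Customizing materials can be viewed as automatic text classification. Because learners are usually interested in fresh material, the most recently updated texts are extracted from the corpus, while purpose, readability, and topic are all selected through automatic text classification. An example-sentence retrieval interface is also provided for the teaching corpus so that teachers can find more suitable examples. Finally, the thesis concludes and looks ahead, showing the great productivity that modern information technology can bring to TCFL teaching practice.﹀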
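Classification number:	TP311.5/H087
Total pages:	67
Total references:	38
Call number:	017/M2012(083)
Release date:	2012-06-02

基于语料库的二本院校非英语专业学生写作现状研究.赵巍

Link

Title:	基于语料库的二本院校非英语专业学生写作现状研究
Name:	赵巍
Student ID:	10917583
Thesis language:	chi
Major:	Software Engineering
Release time:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	俞敬松
Advisor 1 affiliation:	School of Software and Microelectronics
Advisor 2 name:	钱多秀
Advisor 2 affiliation:	School of Foreign Languages, Beihang University
Defense date:	2012-06-02
Foreign-language title:	A Corpus-based Study on English Writing of Non-English Majors in Level-B College
Keywords:	writing; corpus; error analysis; comparative analysis
Foreign-language keywords:	Writing; Corpus; Error Analysis; Comparative Analysis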
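Abstract:	︿English writing reflects a learner's overall command of the language. For most non-English majors, writing is a hard problem to crack; for English teachers, marking compositions is heavy work with little return. Traditional marking can no longer comprehensively and carefully summarize the features of students' writing or their common and serious errors, and corpus-based research offers an effective way to address this. Researchers can draw on the large amounts of data stored in a corpus and use retrieval and management software to describe language features quantitatively and analyze them qualitatively. Grounded in comparative analysis and error analysis, this study takes first- and second-year non-English majors at Heilongjiang Science and Technology Institute (a Level-B, second-tier college) as its subjects and, after collecting texts, builds a small corpus, FSWC (Freshman & Sophomore Written Corpus), as the research corpus. The study uses USARG, a subcorpus of LOCNESS (a corpus of essays by native-speaker university students), and ST3, a subcorpus of the Chinese Learner English Corpus (CLEC), as reference corpora, and uses the corpus tool WordSmith to obtain type/token ratios, word and sentence lengths, and keywords for the three corpora. Sixty texts are then drawn from FSWC as test texts and annotated for errors, and the types and frequencies of errors are counted to identify serious and widespread problems in students' writing. The results show the following features of these students' writing. First, although their vocabulary is fairly varied, most of the words they choose are simple, and their lexical error rate is also high. Second, they tend to write in simple sentences and mostly narrate in the first person. Third, they seldom use the passive voice, and their use of modal verbs is monotonous. Fourth, their errors fall mainly into three categories, syntax, vocabulary, and verbs; the most serious error types are spelling, punctuation, and substitution errors. Supported by real data, these results reveal the writing features of students at this particular Level-B college and the strengths and weaknesses in the teaching of the author and her college English teaching section. On this basis, the author analyzes and infers possible causes of these features and problems. Finally, the thesis proposes corresponding teaching strategies and shows experimentally that a writing pedagogy combining corpus assistance, multiple modes of assessment, and multiple kinds of writing practice outperforms the traditional one in many respects. The study is an attempt to use corpus methods to research and guide teaching, and practice shows that the approach helps college English writing instruction to some degree.﹀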
Foreign-language abstract:	︿ English writing, which reflects learners’ English proficiency, has long been a headache for non-English major undergraduates. Meanwhile, teaching English writing has been a weak point in college English teaching, a time-consuming but ineffective job. The traditional feedback method fails to identify the features of students’ writing or to reveal the frequent and serious errors in it. The corpus-based approach has come into wider use in language research and has yielded more effective results. A corpus is a collection of linguistic data. Researchers use various corpus retrieval and management software to perform quantitative and qualitative analysis in support of different kinds of research. The study, based on Error Analysis theory and Comparative Analysis theory, develops a small-scale writing corpus, FSWC, composed of 300 compositions written by non-English major freshmen and sophomores at Heilongjiang Science and Technology Institute (a Level-B college). The study consists of three major phases: first, comparing FSWC with USARG and ST3 and obtaining data from the three corpora on type/token ratio, average sentence length, average word length, and keywords; second, choosing 60 compositions from FSWC at random as research samples and recognizing, analyzing, and tagging the errors in them; and finally, counting all errors and each specific error in order to find the serious and most frequent errors in students’ writing. The study yields the following findings: 1. Despite the greater lexical variety in FSWC, lexical errors occur in it at a higher rate. 2. Students, fond of first-person narration, are more likely to choose simple and short words and sentences to express their ideas. 3. Passive voice rarely occurs in students’ writing, and their use of modal verbs is monotonous. 4. The research samples reveal the three most common types of errors, namely sentence, lexical, and verb errors. The three error subtypes most frequently found in their writing are spelling, punctuation, and substitution errors. These findings, built on real data, offer an insight into the students’ writing features and into the advantages and disadvantages of the author’s and her colleagues’ English teaching. The causes of students’ errors are analyzed and discussed. Some remedial methods, including the application of the corpus approach and the combination of a variety of evaluation methods, were put into operation in the experimental group’s teaching. The subsequent experiments show that these methods help improve college English writing. The research is an effort to apply the corpus approach to directing and assisting language teaching and learning, and it sheds some light on the teaching of English writing. ﹀
Classification number:	H087/TP311.52
Total pages:	65
Total references:	43
Call number:	017/M2012(456)
Release date:	2012-06-02
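
The record above compares corpora by type/token ratio, word length, and sentence length. As a rough sketch of what those figures measure, the snippet below computes them for a short English text. The tokenization and sentence-splitting rules here are simplifying assumptions, and the results will not match WordSmith's own statistics (its standardised type/token ratio, for instance, is computed over fixed-size chunks).

```python
import re


def corpus_stats(text):
    """Return token/type counts, type/token ratio, and mean lengths for a text."""
    # Naive rules (assumptions): sentences end at . ! ?; tokens are letter runs.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    if not tokens:
        raise ValueError("text contains no tokens")
    types = set(tokens)
    return {
        "tokens": len(tokens),
        "types": len(types),
        "ttr": len(types) / len(tokens),
        "mean_word_length": sum(len(t) for t in tokens) / len(tokens),
        "mean_sentence_length": len(tokens) / len(sentences),
    }


sample = "I like my school. I like my teachers. They help me every day!"
stats = corpus_stats(sample)
print(stats)
```

Because the raw ratio falls as a text gets longer, it is only comparable across corpora of similar size, which is one reason to compare equally sized samples.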

2011-12-10

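面向学术论文计算机辅助翻译的受限汉语研究.王雷

Link

Title:	面向学术论文计算机辅助翻译的受限汉语研究
Name:	王雷
Student ID:	10648887
Thesis language:	chi
Major:	Computer Software and Theory
Release time:	Public
Degree level:	Doctoral
Degree:	Doctor of Science
Institution:	Peking University
School:	School of Electronics Engineering and Computer Science
Advisor 1 name:	俞士汶
Advisor 1 affiliation:	School of Electronics Engineering and Computer Science
Advisor 2 name:	常宝宝
Defense date:	2011-12-10
Foreign-language title:	A Study on Computer-aided Translation of Academic Texts Based on Controlled Chinese
Keywords:	academic papers; linguistic features; controlled Chinese; computer-aided translation; automatic machine translation
Foreign-language keywords:	Academic texts; Linguistic features; Controlled Chinese; Computer-aided translation; Machine translation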
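Abstract:	︿Over the past century and more, as Eastern and Western civilizations kept colliding, Chinese intellectuals translated a great deal of foreign literature. In this process, the Sino-Tibetan languages, represented by Chinese, were influenced by the Indo-European languages, represented by English. This influence changed Chinese in both vocabulary and grammar. Tracing the main causes, we see that translation played a very important role, chiefly in three ways: the translation of Christian classics by Western missionaries in China, the foreign literary works translated by Chinese intellectuals in the late Qing, and the translation of large numbers of books on science and technology, law and politics, and society brought in from Japan, typified by Japanese textbooks. Changes in words, phrases, and syntactic structures make modern Chinese in some sense similar to English, the source language of translation. Because China in the modern era, under Western influence, has mainly imported Western science and technology, these changes are most visible in scientific and technical fields. Besides scientific monographs, which are closely tied to translation, academic papers are an important tool for setting out scientific principles and laws, and translating them plays an irreplaceable role in advancing science and spreading scientific knowledge. Compared with other genres, academic papers have their own linguistic features: a relatively small vocabulary, single meanings, fairly fixed word classes, and fairly regular sentence structures. These features let us exploit the similarity of Chinese and English in a specific domain and genre to design a special-purpose controlled Chinese that simplifies Chinese expressions that are complex relative to English, moving computer-assisted translation of academic papers a step closer to the goal of "human-aided machine translation" and improving the efficiency and quality of translation to some extent. The significance of this work lies in three aspects. 1. The thesis first selects typical Europeanized vernacular texts translated by missionaries and compares them with older vernacular texts to describe the changes in Chinese grammatical structure; it gives a diachronic account of the relationship between the two languages, showing that modern Chinese grammar can be analyzed with reference to English grammar. Then, after examining the pragmatic features of the academic paper as a genre, it analyzes and summarizes the similarities and differences between Chinese and English at the levels of word, phrase, and sentence, and proposes principles and methods for designing a controlled Chinese for computer-aided translation of academic papers. 2. To meet the needs of computer-aided translation of academic papers, and building on existing research, the thesis constructs and refines basic resources for translating academic papers, with computational linguistics as the target discipline: a discipline-related lexicon, a general academic lexicon, a termbase of computational linguistics, and others. It also proposes a scheme for restricting these lexicons and determines the corresponding sets of approved words. These word sets provide valuable reference information for computer-aided translation of academic papers and help translators work more efficiently and accurately. The resources will also be important for research and development in computational linguistics itself. The thesis further describes the Chinese Idiom Knowledge Base for Chinese information processing built by the Institute of Computational Linguistics at Peking University and explores its use in computer-aided translation. In practice, adding fields such as English translations and emotional coloring to its entries not only gives translators more reference information but is also valuable for language research and for teaching Chinese as a foreign language. 3. To verify the effectiveness of the proposed controlled Chinese, we applied restrictions to Chinese academic papers and used internationally accepted tools for evaluating machine translation output to evaluate the translations. The restrictions covered word classes that differ greatly in form from English, such as affixed words, words with multiple classes, measure words, and particles with purely grammatical functions; high-frequency prepositional phrases and noun-headed n-gram phrases; and characteristic Chinese sentence patterns such as subjectless sentences, existential sentences, object-fronted sentences, long premodifying attributives, serial verb constructions, and pivotal constructions. The evaluation shows that, whether with human assistance or by fully automatic machine translation, Chinese academic text that has been restricted and rewritten translates better than text translated directly. Finally, the thesis uses automatic clustering combined with the vector space model to summarize sentence templates that recur in academic papers and stores them in a database for translators' reference. Building on this work, the thesis proposes a platform for computer-aided translation of academic papers. The platform is based on the Déjà Vu computer-aided translation system and includes an automatic translation module developed with Google's open application programming interface. The thesis also covers points to note in post-editing and proofreading translated papers and tools for managing references. The main contributions are as follows: it studies quantitatively, with statistical and computational methods and from the perspective of translation history, how Chinese has developed and changed in the modern era; it verifies empirically that Chinese and English are lexically and grammatically similar in a specific domain and genre; taking academic papers as its object, it proposes a method for designing a controlled Chinese for computer-aided translation of academic papers, automatically builds the related language knowledge bases with rule-based and machine-learning natural language processing techniques, and establishes the approved word sets of the controlled language from the word sets in those knowledge bases; to address the difficulty of using dictionaries in statistical machine translation, it proposes representing a word or chunk with a specific ID, having the system treat it as an OOV item during translation, and restoring the translated word or chunk in the output; and, using this method, it designs experiments that verify that this special-purpose controlled Chinese improves the efficiency and quality of academic paper translation in both computer-aided and fully automatic machine translation.﹀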
Foreign-language abstract:	︿ Over the past century, with the constant collisions of Eastern and Western civilizations, the Sino-Tibetan language family, represented by Chinese, has been strongly influenced by the Indo-European language family, represented by English, in the process of translation and introduction. This influence has produced a number of changes in the lexical and syntactic structures of Chinese. The changes trace back mainly to the Christian classical texts translated by Western missionaries, the works of Chinese translators in the late Qing Dynasty (1636-1911), and the introduction of Japanese textbooks. All these factors contributed to the birth of modern Chinese, which shares great similarities with English, the source language of translation, from lexical, phrasal, and syntactic perspectives. Because China has been introducing science and technology mainly from the West, the changes in Chinese are particularly evident in these two fields. As a major instrument for setting out scientific theories and natural laws, academic texts play an essential role in facilitating progress and disseminating knowledge in science and technology. Compared with other types of literature, Chinese academic texts have their own characteristics, e.g. a relatively small register and vocabulary, distinct word classes, and regular syntactic patterns, which enable us to design a controlled Chinese that serves as a bridge to narrow the gap between the two languages and improves both computer-aided translation and automatic machine translation. The significance of our research lies in the following aspects: 1. With carefully chosen texts in both Euro-style Chinese and ancient Chinese, this paper makes a comparison to reveal the grammatical changes in Chinese and elaborates on the historical facts of this unique transition period of the language, the late Qing Dynasty. 2. Based on up-to-date relevant research, this paper addresses the needs of computer-aided translation of academic texts and constructs several lexicons in the domain of computational linguistics. They include a subject-related lexicon, a universal lexicon for academic texts, a term bank of computational linguistics, etc. As an important reference for computer-aided translation of academic texts, these resources help improve efficiency and quality and facilitate research and development in the subject itself. This paper also elaborates on the construction of the Chinese Idiom Knowledge Base by the Institute of Computational Linguistics at Peking University, with a focus on its applications in computer-aided translation. With entries tagged with such information, translators are provided with useful knowledge and good references for their work. The knowledge base is also of great help to language research and to teaching Chinese as a foreign language. 3. To verify the validity of controlled Chinese for computer-aided translation of academic texts, we controlled, at the word level, words with affixes, terminology, measure words, and function words; at the phrase level, prepositional phrases and n-gram nominal phrases headed by nouns; and, at the sentence level, existential sentences, attributive clauses, premodifying objects, and subject ellipses. Experiments show that our method yields better performance both in human-aided translation and in automatic machine translation. Finally, we summarize the syntactic patterns of academic texts using a clustering method combined with the vector space model (VSM) and store them in a database for translators’ reference. Based on the above research, we also propose a scheme for establishing a Déjà Vu-based computer-aided translation platform with the Google Translation API embedded as the automatic translation module. In addition, this paper provides some information on post-editing and correcting translations and on a tool for managing the references of academic texts. The innovations of our research are as follows. In the light of translation history, we describe the development of the Chinese language with statistical and computational methods and show with solid evidence that Chinese and English are similar in lexicon and grammar in a specific domain. We propose a method of controlled Chinese for computer-aided translation of academic texts; using NLP techniques such as rule-based and machine learning methods, we construct several lexicons that include both the chosen and the non-chosen words of our controlled language, based on the similarities of English and Chinese, and complete experiments that verify the validity of our method. As for the difficulty of integrating dictionaries into an SMT system, we propose a method in which a word or chunk is represented by a particular ID, with which the original text is fed into the system. During translation the ID is treated as an OOV item, and after translation it is replaced by its target word or chunk. The results show that our method improves both the quality and the efficiency of computer-aided translation and automatic machine translation of the academic texts under investigation. ﹀
Classification number:	H087
Total pages:	119
Total references:	0
Call number:	048/D2011(72)
Release date:	2011-12-10
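
The last contribution in the abstract above, replacing dictionary terms with IDs that the translation system passes through as OOV items and restoring them afterwards, can be sketched as follows. This is only an illustration of the idea: `fake_mt`, the `TERMID` placeholder format, and the two-entry dictionary are hypothetical stand-ins, not the thesis's actual system.

```python
def protect_terms(text, term_dict):
    """Replace each dictionary term with an opaque ID; return text and ID map."""
    id_map = {}
    # Longest terms first, so a short term never splits a longer one.
    for i, term in enumerate(sorted(term_dict, key=len, reverse=True)):
        if term in text:
            placeholder = f"TERMID{i:04d}"
            text = text.replace(term, placeholder)
            id_map[placeholder] = term_dict[term]
    return text, id_map


def restore_terms(translated, id_map):
    """Swap each ID in the MT output for its dictionary translation."""
    for placeholder, target in id_map.items():
        translated = translated.replace(placeholder, target)
    return translated


def fake_mt(text):
    """Hypothetical stand-in for an MT system: unknown tokens pass through."""
    return text.replace("改进", " improves ")


term_dict = {"受限汉语": "controlled Chinese", "机器翻译": "machine translation"}
protected, id_map = protect_terms("受限汉语改进机器翻译", term_dict)
result = restore_terms(fake_mt(protected), id_map)
print(protected)
print(result)
```

The approach relies on the translation system copying unknown tokens to its output unchanged; a system that drops or rewrites OOV tokens would need a different hook.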

2011-11-28

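本地化用户文档系统功能分析及其翻译.张凤

Link

Title:	本地化用户文档系统功能分析及其翻译
Name:	张凤
Student ID:	10817467
Thesis language:	chi
Major:	Software Engineering
Release time:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	何卫
Advisor 1 affiliation:	School of Foreign Languages
Advisor 2 name:	俞敬松
Advisor 2 affiliation:	School of Software and Microelectronics
Defense date:	2011-11-28
Foreign-language title:	Systemic Functional Analysis of Localization Documentation and its Translation
Keywords:	functional equivalence; user documentation; localization; translation
Foreign-language keywords:	functional equivalence; user documentation; localization; translation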
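Abstract:	︿Drawing on Halliday's systemic functional grammar, this thesis examines the practice of localizing English user documentation from a systemic functional perspective, seeking the best way for localized user documentation to achieve dynamic equivalence with the source text. The author applies functional linguistics to analyzing the language of user documentation and to studying methods for translating it in localization. Analysis at three levels, word, sentence, and text, shows that user documentation makes heavy use of simple verbs, that its terms are formed in four ways, that it conveys modal meaning through imperatives, and that it keeps textual meaning coherent through thematic patterns and grammatical cohesion. In response to these features, and with reference to the functions of language, the author proposes concept-equivalent literal translation, concept-equivalent free translation, and a context-based method to solve problems in term translation. The author also holds that term translation must take into account the influence of regional cultural factors on localization. Finally, given the function of user documentation as text, the thesis argues that translators need to adjust the text flexibly as a whole to keep its semantic cohesion consistent and coherent, so that the documentation fulfills its function of conveying information. Using systemic functional grammar, the thesis offers concrete solutions to translation problems that translators often meet in practice and opens a new path for future research on the language of localized user documentation and its translation.﹀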
Foreign-language abstract:	︿ Based on the theory of systemic-functional grammar developed by M.A.K. Halliday, this paper studies the practice of localizing documentation from the viewpoint of functional grammar, in order to explore the proper way to achieve dynamic equivalence between the localized translation and the original text. The author applies systemic-functional grammar to the study of user documentation at three levels: words, sentences, and texts. The author finds that user documentation tends to use simple verbs and four types of terms; that its modal meaning is expressed in imperatives; and that semantic coherence is maintained through thematic progression and grammatical cohesive devices. Building on the features found at these three levels, the author proposes translation methods such as conceptual-equivalent literal translation, conceptual-equivalent liberal translation, and a context-translation method to solve the difficulties of term translation. Regional differences should also be considered in the localization process. Finally, given the functions of user documentation itself, the translator should translate in a relatively flexible way to keep the text cohesive so that it fulfills its function of delivering information. Using the theory of systemic-functional grammar, this thesis puts forward solutions to the non-equivalence issues that translators meet in localization practice and opens new ground for future research on the features of user documentation and its translation. ﹀
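Classification number:	H04/TP311.13
Total pages:	55
Total references:	0
Call number:	017/M2011(680)
Release date:	2011-11-28

名化在科技语篇汉英翻译中的应用.杨倩

Link

Title:	名化在科技语篇汉英翻译中的应用
Name:	杨倩
Student ID:	10817432
Thesis language:	chi
Major:	Software Engineering
Release time:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	何卫
Advisor 1 affiliation:	School of Foreign Languages
Advisor 2 name:	俞敬松
Advisor 2 affiliation:	School of Software and Microelectronics
Defense date:	2011-11-28
Foreign-language title:	The Application of Nominalization in C-E Translation of Scientific Discourse
Keywords:	nominalization; English for science and technology; corpus; contrastive analysis; translation
Foreign-language keywords:	Nominalization; EST (English for science and technology); Corpus; Contrastive Analysis; Translation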
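Abstract:	︿As China takes part in globalization and scientific and technical exchange between countries becomes more frequent, Chinese-English translation of scientific and technical discourse is growing in importance. Nominalization, a prominent feature of English scientific discourse, helps make expression more concise and objective. When translating scientific discourse from Chinese into English, the translator must therefore, having understood the Chinese original, use nominalized structures appropriately in line with the features of English scientific discourse so that the translation better fits the conventions of the target language, English. Using a corpus-based method, this thesis compares the use of nominalization in a non-translated corpus (EP.FLOB) and a translated corpus (TR.SAC) and finds three main differences. 1) In EP.FLOB, nominalized structures rank by frequency as nominal clause > infinitive > gerund; in TR.SAC, they rank as gerund > infinitive > nominal clause. 2) Compared with native speakers, English learners use the structure "nominalization + of + noun" more often; comparison with the SAC parallel corpus shows that this is mainly due to the highly productive Chinese nominalizing structure "…的…". Learners also use more deverbal nominalizations than native speakers, but type counts show that their choice of vocabulary is monotonous. 3) Learners prefer gerund nominalizations, first because the form is highly productive, since most verbs can take it, and second because translators adopt it under the influence of the Chinese verb-object structure. The author attributes this mainly to two causes: 1) learners have a narrow understanding of English nominalized structures; 2) their use of these structures is influenced by their native Chinese. Finally, through worked examples, the thesis discusses principles and techniques suited to Chinese-English translation of scientific discourse. Based on the differences between the two languages and between their nominalized structures, it proposes a target-language-oriented principle and an information-oriented principle: Chinese nouns, verbs, or adjectives can be translated as nominalized structures at the same rank, a Chinese simple sentence can be down-ranked into a nominalized structure, and nominalized structures help fix the subject and advance the discourse. Although it cannot cover nominalization in scientific English exhaustively, the study hopes to strengthen learners' understanding of nominalization and to provide a reference for reading and writing scientific English and for teaching and learning scientific translation.﹀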
Foreign-language abstract:	︿ Since China has been involved in globalization, Chinese-English technical translation has grown in importance as scientific and technical exchange increases around the world. As a significant feature of EST, nominalization structures make discourse concise and objective. Therefore, on the basis of understanding the Chinese text, nominalization structures must be used properly in Chinese-English translation, according to the features of EST, to bring the translation more into line with the conventions of the target language. The present study adopts a corpus-based method to contrast nominalizations in EP.FLOB and TR.SAC and finds several differences: 1. For native English speakers, the frequency ranking of nominalization types is Noun Clause > Infinitive > Gerund; for English learners, it is Gerund > Infinitive > Noun Clause. 2. Compared with native speakers, English learners favor the Nominalization + of + Noun structure, which, according to concordances in the parallel corpus SAC, is influenced by the highly productive Chinese nominalizing structure “…的…”; English learners also favor deverbal nominalizations more than native English speakers do, but type statistics for those deverbal forms show that the selection is monotonous in TR.SAC. 3. English learners favor the gerund for two reasons: first, a gerund can be formed from most verbs, and second, its selection is influenced by the Verb-Object construction in Chinese. The main reasons for these differences, according to the author, are a narrow understanding of English nominalization and the influence of the mother tongue. Finally, the thesis turns to specific principles and techniques suitable for translation, based on translation examples. According to the different linguistic features of Chinese and English and the different structures of the nominal group, nouns, verbs, and adjectives in Chinese can be converted into nominalizations at the same rank; nominalizations can be realized as nouns, verbs, and adjectives at the same rank, or a sentence can be converted into clauses at a lower rank in English; and nominalization structures can be the best way of defining the subject. Although this paper cannot exhaust the phenomenon of nominalization in EST, the research aims to enhance EST readers’ comprehension of nominalization and to provide a reference for EST reading and writing and for the teaching and learning of translation of scientific texts. ﹀
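Classification number:	H085
Total pages:	72
Total references:	44
Call number:	017/M2011(677)
Release date:	2011-11-28

面向国际汉语教学的成语应用偏误研究及成语学习知识库的设计与建设.张一宁

Link

Title:	面向国际汉语教学的成语应用偏误研究及成语学习知识库的设计与建设
Name:	张一宁
Student ID:	10817491
Thesis language:	chi
Major:	Software Engineering
Release time:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	俞士汶
Advisor 1 affiliation:	School of Electronics Engineering and Computer Science
Advisor 2 name:	俞敬松
Defense date:	2011-11-28
Foreign-language title:	Error Research and the Design and Construction of Idiom Learning Knowledge Base for International Chinese Language Teaching
Keywords:	international Chinese language teaching; error research; idiom learning knowledge base
Foreign-language keywords:	International Chinese Language Teaching; Error Research; Idiom Learning Knowledge Base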
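Abstract:	︿As exchange and cooperation between China and other countries deepen in education, science and technology, the economy, and culture, Chinese culture has become widely popular around the world. Idioms (chengyu) are a treasure of Chinese culture and the essence of the Chinese language. In international Chinese language teaching, learning and mastering idioms is important for understanding Chinese culture and raising proficiency. Despite their importance, idiom learning in international Chinese teaching is not going well. A study of 225 idiom usage errors in the HSK Dynamic Composition Corpus of Beijing Language and Culture University shows that foreign learners make many kinds of errors when using Chinese idioms, in meaning, grammar, emotional coloring, applicable objects and scope, collocation, and more. To serve international Chinese teaching more effectively, help learners master the meaning and use of idioms, and let them learn about Chinese culture through idioms, this thesis analyzes idiom usage errors in depth and, drawing on the inherent features of Chinese idioms and on corpus statistics, designs and builds an Idiom Learning Knowledge Base for international Chinese teaching. The knowledge base contains 18 attribute fields and is rich in information. Its data on grammatical function, some example sentences, common collocations and their frequencies, and applicable objects and scope come from searches and counts in the CCL Corpus of the Center for Chinese Linguistics at Peking University, which does the most to ensure that the idiom information is authentic and accurate. The knowledge base helps learners master idioms comprehensively, improves the quality of idiom teaching in international Chinese teaching, and can help spread Chinese culture internationally.﹀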
Foreign-language abstract:	︿ As China enhances its cooperation with other countries in education, economy, culture, science, and technology, Chinese culture has become increasingly popular worldwide. Idioms are a treasure of the Chinese nation and the essence of Chinese culture. Learning and mastering idioms is therefore significant for a learner’s understanding of Chinese culture and improvement in language level in International Chinese Language Teaching. Although idioms play a significant role in Chinese culture, learners’ use of idioms has a long way to go. By studying 225 errors in the “Beijing Language and Culture University HSK Dynamic Composition Corpus”, we found that foreign students often have problems in applying idioms. Their errors lie mainly in semantic understanding, syntactic function, emotional coloring, applicable objects and scope, frequent collocations, and so on. To serve International Chinese Language Teaching more effectively, an “Idiom Learning Knowledge Base” has been constructed. It is designed to help learners of Chinese master the meaning and usage of idioms and thus learn more about Chinese culture. The “Idiom Learning Knowledge Base” has 18 property fields. Among its rich content, the grammatical functions of idioms, some of the examples, frequent collocations and their frequencies, and applicable objects and scope come from searches and statistics in the “Center for Chinese Linguistics PKU Corpus”. This approach guarantees the authenticity and accuracy of the information to the greatest extent. The “Idiom Learning Knowledge Base” can help language learners master idiom knowledge comprehensively and thus raise the level of International Chinese Language Teaching. It will also contribute to the global spread of Chinese culture. ﹀
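Classification number:	H087/TP391.12
Total pages:	63
Total references:	0
Call number:	017/M2011(682)
Release date:	2011-11-28

英汉语篇衔接及其翻译.刘劲松

Link

Title:	英汉语篇衔接及其翻译
Name:	刘劲松
Student ID:	10817239
Thesis language:	chi
Major:	Software Engineering
Release time:	Public
Degree level:	Master's
Degree:	Master of Engineering (professional degree)
Institution:	Peking University
School:	School of Software and Microelectronics
Advisor 1 name:	俞士汶
Advisor 1 affiliation:	School of Electronics Engineering and Computer Science
Advisor 2 name:	俞敬松
Advisor 2 affiliation:	School of Software and Microelectronics
Defense date:	2011-11-28
Foreign-language title:	English-Chinese Textual Cohesion and its Translation
Keywords:	corpus; textual cohesion; translation; comparison
Foreign-language keywords:	Corpus; Textual Cohesion; Translation; Comparison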
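Abstract:	︿With the spread and rapid development of computer and information technology, language researchers have begun to use computers and information technology to study language and translation and to teach languages, and research on language and translation relies more and more on statistical methods and language data. The corpus method is one of the most effective methods recently used in linguistics. Applied to linguistic research, it collects large amounts of authentic language data and describes and explains them fully and objectively; this is the task of corpus linguistics. As the status of translated texts became established, they gained the same standing as source texts, laying the foundation for corpus-based translation studies. Based on the theoretical framework of Halliday and Hasan's Cohesion in English, with reference to Chinese scholars' discussions of cohesion in English and Chinese texts and with full regard to the features of Chinese, this thesis uses English-Chinese parallel corpora together with WordSmith, the ParaConc parallel concordancer, and the Concordancer search tool developed by the Institute of Computational Linguistics at Peking University to analyze cohesive devices in English and Chinese texts quantitatively and to examine the regularities of cohesion and its translation. The author holds that translators must start from the text, move from the whole to the parts, and weigh all the factors of cohesion in English and Chinese texts in order to understand accurately and write fluently and idiomatically. In deconstructing the source text, the translator also reconstructs the target text; the cohesive devices of the source text are converted into those native to the target language and become the ties that hold the target text together. Translation research and practice should attend to the cohesive relations between words and sentences and to the translation of cohesive devices, so as to improve the overall quality and readability of translations at the root. Finally, the thesis offers suggestions for corpus-based research on text translation and for developing and improving parallel concordancers.﹀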
Foreign-language abstract:	︿ With the rapid development of computer and information technology, linguists tend to use computers and information technology tools to study language and aid language teaching. Translation scholars have become more interested in statistical methods and seek linguistic evidence supported by linguistic data. The corpus has emerged as an empirical research method. Based on Halliday and Hasan’s theoretical framework in Cohesion in English and on the nature of the Chinese language, the author uses English-Chinese corpora and corpus tools such as WordSmith, ParaConc, and Concordancer to investigate cohesive devices in both English and Chinese texts, and then puts forward ways to translate them effectively into Chinese, or vice versa. The author emphasizes that translators should consider textual cohesive devices at the level of words, phrases, sentences, and paragraphs from the viewpoint of the text. In the translation process, the translator is in fact reconstructing the cohesive target text while deconstructing the original. The cohesive devices in the original text are therefore translated into corresponding devices in the target language. Practical and theoretical research into translation should pay more attention to cohesion, which goes beyond lexical words and syntax, so that the overall quality and readability of the translation are greatly improved. Finally, the author offers proposals for developing and improving the present parallel concordancer. ﹀
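Classification number:	H087/TP311.52
Total pages:	60
Total references:	130
Call number:	017/M2011(668)
Release date:	2011-11-28

2011-06-05

古汉语文本自动句读研究.许京奕

Link

Title:	古汉语文本自动句读研究
Name:	许京奕
Student ID:	10648886
Thesis language:	chi
Major:	Computer Software and Theory
Release time:	After 1 year
Degree level:	Doctoral
Degree:	Doctor of Science
Institution:	Peking University
School:	School of Electronics Engineering and Computer Science
Advisor 1 name:	俞士汶
Advisor 1 affiliation:	School of Electronics Engineering and Computer Science
Defense date:	2011-06-05
Foreign-language title:	Sentence Segmentation in Classical Chinese Texts
Keywords:	classical Chinese; sentence segmentation (judou); automatic tagging; machine learning
Foreign-language keywords:	Classical Chinese; Sentence Segmentation; Automatic Tagging; Machine Learning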
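Abstract:	︿Chinese is a typical paratactic language, and the difficulty of studying it is widely acknowledged. Compared with modern Chinese, classical Chinese is marked by its diachronic character, which aggravates the problems of non-identical data distribution and sparsity and has slowed research on classical Chinese information processing. Out-of-vocabulary words and the lack of language resources have also held the field back. How to build classical Chinese language processing models usable in research and engineering, given the absence of morphological change and the diachronic problem, is therefore a goal many researchers pursue. China has a long history and a vast body of surviving documents, the great majority of which have no punctuation at all. Punctuating them would help people read and understand them and would support higher-level information processing such as semantic analysis. This thesis therefore takes automatic sentence segmentation (judou) of classical Chinese text as its main subject and explores how to segment sentence-level units in the text of ancient documents. What method to use for studying classical Chinese is a basic question in urgent need of an answer. Practice shows that Western linguistic theory is a useful reference for classical Chinese research, but copying it wholesale does not always solve the problems that arise. Combining proven theories and methods with the characteristics of classical Chinese may therefore give better results. In both classical and modern Chinese, the boundaries between character and word, between word and phrase, and between phrase and clause are blurred, and these are precisely weak points in linguistic research. Treating characters as words for this reason would be self-deception. On the basis of this understanding and of earlier research, the thesis approaches automatic sentence segmentation of classical Chinese from the most basic level, the actual use of characters, and explores the structure of classical Chinese using basic linguistic theory and information technology. The specific content and contributions are as follows. 1. Character clustering. The study of characters is very important for classical Chinese, yet research on character categories is still immature, partly because the corpora are hard to process and partly because the necessary theory and technology are lacking. The contribution here is a clustering method based on the substitutability of characters in use, with which a category system for characters is built automatically. Starting from the idea of substitutability, the method computes the degree to which different characters can replace one another in the corpus and then clusters the characters with affinity propagation, aiming at a category system that matches how characters are actually used. Experiments show that the category system built this way from the Records of the Grand Historian (《史记》) reflects the actual use of characters and also reveals grammatical and semantic information hidden behind them. Finally, the thesis builds a knowledge base of character categories that benefits later research. 2. Segmentation of word-level units in classical Chinese. Because a segmentation standard is hard to define, the thesis treats character combinations with stable collocation relations as word-level units, which from a diachronic viewpoint fits how Chinese words form. Since word-level units are the direct constituents of sentence-level units, segmenting them is especially important. To address shortcomings of current research, and after a close study of the differences between the left and right collocations of language units in sequential text, the thesis proposes a finer-grained model of collocation relations and builds a practical word-level unit segmentation system. This model is the core and foundation of the whole work and one of its innovations. Based on it, the thesis shows how the collocations of word-level units vary within a sentence, laying the foundation for further research on sentence-level segmentation. 3. Segmentation of sentence-level units in classical Chinese. Compared with character clustering and word-level segmentation, sentence-level segmentation draws everything together. Having introduced character clustering features and word-level unit features, and considering that the relation between adjacent collocation strengths reflects how completely a meaning has been expressed, the thesis constructs features for the relations between adjacent collocation strengths and builds corresponding supervised and unsupervised segmentation models; this is another important contribution. Experiments compare representative current segmentation models with the proposed ones and confirm the effectiveness of the proposed models. Rather than starting from an existing grammatical framework, the thesis starts from actual language use and studies character clustering and word-level segmentation with statistical and machine learning methods, and on that basis builds a sentence-level segmentation model and system. The model sidesteps the lack or incompleteness of relevant theory (such as a word segmentation standard) and the scarcity of knowledge bases, and reveals grammatical and semantic knowledge hidden in text data. Research and engineering practice from 2008 to 2011 (the results have been applied in projects such as the 资治通鉴分析系统, the 中国历代典籍总目系统, and 全球华人寻根网) shows that the proposed methods, systems, and resources can support teaching and research in classical Chinese and can also be applied in related engineering projects.﹀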
外文摘要：	︿ It is well known that Chinese is a typical paratactic language and that studying it is very hard. As an important part of Chinese, classical Chinese has more pronounced diachronic features than modern Chinese. These diachronic features aggravate the non-identical distribution problem and the data sparseness problem. In addition, the out-of-vocabulary (OOV) problem and the shortage of language resources hinder research on classical Chinese information processing. Consequently, many researchers have tried to build language processing models for classical Chinese that can serve scientific research and engineering despite these diachronic features and the absence of morphological variation. A vast body of classical Chinese literature survives in China today, most of it written without any punctuation marks. Punctuating these texts would make them easier both to read and to process further. Therefore, our research focuses on sentence segmentation in classical Chinese texts. One of the most pressing tasks today is choosing effective theories for the study of classical Chinese. Many theories of Western linguistics can serve as a reference for classical Chinese, but copying them wholesale does not always solve the problem. A better way may therefore be to combine existing achievements with the characteristics of classical Chinese itself. In both classical and modern Chinese, the boundary between character and word is very vague, and so are the boundaries between word and phrase and between phrase and clause. These boundary problems are precisely a weak point of linguistic research. Some researchers therefore treat the Chinese character as the Chinese word, an approach that practice has shown to be inappropriate. Based on these observations and on actual Chinese usage, we study the structure of classical Chinese for clause segmentation using basic linguistic principles and information science technology. The main contributions of this dissertation are as follows: 1. Chinese Character Clustering. Although Chinese character processing is one of the foundations of Chinese information processing, research on Chinese character categories is still immature. Difficult corpus processing is one barrier, and the lack of relevant theory and technology is another. Therefore, based on the degree of grammatical and semantic substitutability between Chinese characters, this dissertation proposes an unsupervised method for building a Chinese character category system. First, it calculates the substitutability between any two Chinese characters in the corpus. Second, it clusters the characters using the affinity propagation algorithm. Finally, it builds a Chinese character knowledge base that embodies the rules of actual character usage. The experiments show that the resulting category system reflects not only how characters are actually used but also the grammatical and semantic information behind them. The knowledge base will play a useful role in later processing. 2. Word and Phrase Level Unit Segmentation (WPLUSeg). Because a Chinese word segmentation standard cannot be well defined, we take stable collocation strings in classical Chinese as word and phrase level units (WPLUs). From a diachronic point of view, this is consistent with the principles of Chinese word formation. Because the WPLU is the direct constituent of the clause and sentence level unit in classical Chinese, WPLUSeg is very important to syntactic analysis. Based on in-depth research into the differences in language unit collocation in classical Chinese texts, we propose a more refined model for describing collocation and construct a practical WPLUSeg system. The system demonstrates the collocation trends of Chinese characters forming clauses and sentences in classical Chinese and lays a theoretical foundation for further research on automatic sentence segmentation of classical Chinese. 3. Clause and Sentence Level Unit Segmentation (CSLUSeg). CSLUSeg builds on and integrates Chinese character clustering and WPLUSeg. Because the relation between adjacent collocation strengths reflects how completely a meaning has been expressed, we propose a supervised model and two unsupervised models for CSLUSeg based on adjacent collocation features, Chinese character features and WPLU features. In the experiments, we compare representative existing segmentation models with our proposed models, and the results show that our models are more effective. In this dissertation, based on Chinese character clustering, WPLUSeg and CSLUSeg, we study classical Chinese sentence segmentation with statistical and machine learning methods. Our methods sidestep problems such as the lack of a Chinese word segmentation standard and the scarcity of language knowledge bases, and they reveal the grammatical and semantic knowledge hidden in the data. Our results can support linguistic research on classical Chinese and can also be useful in related engineering projects. ﹀
分类号：	TP181
论文总页数：	141
参考文献总数：	73
馆藏号：	048/D2011(30)
公开日期：	2012-06-05

2011-06-02

软件本地化中标点符号的翻译策略.黄河

题名：	软件本地化中标点符号的翻译策略
姓名：	黄河
学号：	10817162
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	何卫
导师1单位：	软件与微电子学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2011-06-02
关键词：	标点符号；语言的元功能；修辞关系结构理论；翻译
论文摘要：	︿标点符号是书面语的重要组成部分，因此翻译必须重视标点符号的转换，但是在翻译中，标点符号的丰富内涵意义往往被忽略。目前，在本地化翻译的过程中，译员普遍没有对标点符号引起足够的重视，认为标点符号仅仅起着语法功能，在译文中随心所欲地使用标点，或者是简单对等移植和转换。而本地化翻译作为科技翻译的一个分支，译文必须要有较高的准确性和规范性，软件界面文本的翻译还必须具有一定的人际互动性。译员的不重视和译文的高要求之间的差异，造成了现在标点符号在本地化中的翻译出现了规范性、准确性、连贯性和人文性方面的问题。本文在韩礼德提出语言的元功能理论上，结合Mann和Thompson提出的修辞关系结构理论，分析了主要的标点符号在联机帮助、手册和软件界面文本等软件本地化文本里的元功能，并根据纽马克的“等效”标准，提出本地化文本中标点符号的翻译策略。最后，设置了翻译测试文稿，同时交与学习过本文提出的翻译策略的译员和不知道此翻译策略的译员翻译，通过对比他们的译稿的翻译质量，证明了本翻译策略的有效性。﹀
分类号：	H085/TP311.5
论文总页数：	67
参考文献总数：	0
馆藏号：	017/M2011(097)
公开日期：	2011-06-02

2010-09-01

生物序列比对的概率图算法在并行通用处理器上的设计与实现.尹朝明

题名：	生物序列比对的概率图算法在并行通用处理器上的设计与实现
姓名：	尹朝明
学号：	10717350
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	软件与微电子学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-09-01
关键词：	概率图模型；生物信息学；通用图形处理器计算；隐马尔可夫模型；条件随机域模型
论文摘要：	︿随着人们在生物信息学领域研究的不断深入，生物数据库正以极快的速度膨胀。生物序列比对作为搜索生物数据、分析生物数据的基础任务，具有很大的加速意义。基于概率图模型的生物序列比对能够通过现实序列的经验来修正固定的序列比对的目标函数，但也存在着效率上的不足。本文分析了两种典型的概率图模型在生物序列比对中的应用，一种是基于生成式的隐马尔科夫模型，一种是基于区分式的条件随机域模型。两种模型在应用于生物序列比对时，条件随机域模型在训练过程中的迭代算法和隐马尔科夫模型在标注过程中的动态规划算法耗用了很大的时间。本文主要贡献如下：1：针对两种模型对应的算法的特点，设计了基于统一计算架构（CUDA）的适应于通用图形处理器（GPGPU）的单指令流多线程（SIMT）算法。2：针对图形处理器的线程调度和内存访问特点，设计了并行算法的优化方案，包含了矩阵变形、内存访问局部化、计算类型分化等一系列方法。3：利用数据依赖关系的分析结果，设计了一种在训练条件随机域模型时可同时并行多个迭代的并行算法。4：针对基于隐马尔科夫模型在比对长序列中出现的设备端内存不足的情况，设计了基于运算/通讯重叠的流算法和基于同源序列的长序列分割的算法。最后，本文在NVIDIA GeForce GTX 9800图形处理器上使用统一计算架构CUDA编程框架，对算法进行了实现，并在不同的操作系统环境和编译环境下对实验结果进行了比较和分析。实验证明，在不同的环境下基于通用图形处理器的并行算法都能够取得一定的加速效果，其中最优能够达到10倍以上的加速比。﹀
分类号：	TP301.6
论文总页数：	70
参考文献总数：	58
馆藏号：	017/M2010(186)
公开日期：	2010-09-01

2010-06-12

以均根匀度为中心的语言信息计量研究.张化瑞

题名：	以均根匀度为中心的语言信息计量研究
姓名：	张化瑞
学号：	10208848
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	博士
学位：	工学博士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2010-06-12
外文题名：	Quantitative Measure of Language Information Concentrated on Square-Mean-Root Evenness
关键词：	均根匀度；均方匀度；熵匀度；分布均匀度；散布系数
外文关键词：	Square Mean Root Evenness; Root Mean Square Evenness; Shannon Evenness; Distributional Consistency; Measure of Dispersion
论文摘要：	︿频度、匀度和信息熵是语言的计量研究中极为重要的指标，本文对均方匀度、熵匀度和均根匀度进行了系统的比较研究，并针对均根匀度进行了深入的探索，得到了二元均根匀度和多元均根匀度的合理定义、基于相似度的熵和基于隶属度的熵，并以均根匀度为依据检验了新发现的与汉语相关的三个基本统计规律。以下为具体结果：1. 梳理了以前的研究中使用的“分布率、使用度、通用度、散布系数（散布率、散布度）”等概念和名称，提出使用“匀度”一词来描述一个词语在语料库中分布的均匀程度，并将常用的三种匀度指标分类命名为“均方匀度”、“均根匀度”和“熵匀度”。2. 提出了检验匀度度量指标合理性的三条准则：a) 在n等分的情况下应该与人的直觉相符合，如只在m个等分中以相同次数出现则其匀度应该为m/n;b) 在将划分逐步进行合并时，匀度的值应该保持不减;c) 在非等分的情况下，将出现的相对频度相同的几个部分进行合并，匀度的值应保持不变。并以此准则对现有匀度指标进行检验，发现均方匀度和熵匀度均不符合，只有均根匀度完全符合以上三条准则。3. 把平方加权均根匀度应用于汉语广义虚词的界定，利用该指标验证了广义虚词应该包括传统虚词、方位词和动词中的形式动词、助动词和趋向动词的结论；并对形式动词的范围进行了适当推广，给出关于副词的“虚实之辨”的一个解决方案。4. 用二元均根匀度检验了以下统计规律：一个更好地描述汉字的频序关系的公式；一个更好地描述汉字的笔画数分布的公式——幂正态分布，平方根正态分布为其特例；对数正态分布对汉语句长分布(基于人民日报语料库)具有更好的描述能力。5. 从对阿罗不可能定理的公理基础的质疑出发，基于“只有可消解的不确定性才有对应的信息量”提出了基于相似度的熵和基于隶属度的熵，使信息熵的计算能够把随机性和模糊性统一在一起进行考虑。通过对均方匀度和熵匀度的检验发现其不符合本文提出的三条准则，通过对均根匀度的扩展使其也能用于二元和多元的情况，本文建立了一个评价和发展匀度度量指标的自洽体系。﹀
外文摘要：	︿ Frequency, evenness (or degree of uniformity, distributional consistency) and information entropy are of fundamental importance for quantitative research on language information. This dissertation systematically compares root-mean-square (RMS) evenness, Shannon entropy evenness and square-mean-root (SMR) evenness, and investigates SMR evenness in detail. Reasonable definitions of binary SMR evenness and multiple SMR evenness are obtained. Three statistical laws of Chinese characters and sentences recently discovered by the author are tested with binary SMR evenness. The main results are as follows: 1. Concepts used in previous research, such as range, dispersion index/coefficient, usage, and adjusted/modified frequency, are compared, and the term evenness is proposed for the measure of distributional consistency of a specific word in a corpus. The three common kinds of evenness measure are named root-mean-square evenness, Shannon entropy evenness and square-mean-root evenness. 2. Three criteria for testing the reasonableness of an evenness measure are proposed: a) when a corpus is divided into n divisions of equal size, the evenness of a word should be m/n if it occurs in only m divisions with equal occurrence counts; b) when divisions are combined, the value of evenness should not decrease; c) when several divisions of unequal size but with the same relative frequency are combined, the value of evenness should remain unchanged. Evaluation shows that neither RMS evenness nor Shannon evenness satisfies these criteria, whereas SMR evenness conforms to all of them. 3. SMR evenness and weighted SMR evenness are applied as criteria for deciding whether a word or a class of words should be considered generalized function words. The results show that most traditional function words have high evenness. A constructive answer is given to the question “Are Chinese adverbs content words or function words?” 4. New models are formulated for the frequency-rank relation and the stroke-number distribution of Chinese characters, and a new model is formulated for the sentence length distribution in Chinese (sampled from the People’s Daily). 5. Similarity-based entropy and membership-based entropy are proposed by questioning the axiomatic basis of Arrow’s impossibility theorem and by assuming that only resolvable uncertainty can provide information. Through the evaluation of RMS evenness and Shannon evenness and the extension of SMR evenness to binary and multiple SMR evenness, a self-consistent system is constructed for evaluating and developing measures of evenness. ﹀
分类号：	TP391.1
论文总页数：	87
参考文献总数：	82
馆藏号：	048/D2010(06)
公开日期：	2010-06-12
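
The abstract above states three criteria for an evenness measure but not the formula itself. The following is a minimal sketch, assuming that SMR evenness takes the form (Σ sqrt(w_i · p_i))², where w_i is the share of the corpus in division i and p_i is the share of the word's occurrences in division i. This form and the function name `smr_evenness` are assumptions for illustration, not the dissertation's definition; the sketch only shows that such a form behaves as criteria a) to c) require.

```python
from math import sqrt


def smr_evenness(counts, sizes=None):
    """Square-mean-root evenness of one word across corpus divisions.

    counts[i]: occurrences of the word in division i.
    sizes[i]:  size of division i (any unit); equal sizes if omitted.
    Assumed form (not taken from the source): (sum_i sqrt(w_i * p_i)) ** 2.
    """
    if sizes is None:
        sizes = [1] * len(counts)
    if len(sizes) != len(counts):
        raise ValueError("counts and sizes must have the same length")
    total_count = sum(counts)
    total_size = sum(sizes)
    if total_count == 0:
        return 0.0  # the word never occurs, so there is nothing to spread
    root_sum = sum(
        sqrt((size / total_size) * (count / total_count))
        for count, size in zip(counts, sizes)
    )
    return root_sum ** 2
```

For example, `smr_evenness([5, 5, 0, 0])` evaluates to about 0.5, which is the m/n value that criterion a) asks for when a word occurs equally often in 2 of 4 equal divisions.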

2010-06-04

汉语文本中的隐喻计算研究.贾玉祥

题名：	汉语文本中的隐喻计算研究
姓名：	贾玉祥
学号：	10648892
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	博士
学位：	理学博士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2010-06-04
外文题名：	Metaphor Computing in Chinese Text
关键词：	隐喻识别；隐喻理解；隐喻生成；知识获取；机器学习
外文关键词：	Metaphor Recognition; Metaphor Understanding; Metaphor Generation; Knowledge Acquisition; Machine Learning
论文摘要：	︿自然语言中存在大量的非字面意义的表达，如隐喻、转喻等，这些表达的真正含义无法从字面上直接获得，有时其字面义是讲不通的。这给自然语言理解提出了挑战，成为自然语言理解必须攻克的堡垒，其中最主要的是隐喻。隐喻是通过一个事物来表达另外一个事物，体现着一种类比的认知或思维方式，长期以来是多学科研究的课题，包括语言学、修辞学、认知学、心理学、哲学等。因此，隐喻计算的研究一方面可以服务于自然语言处理，另一方面也具有多学科的意义。然而，由于从计算机科学的角度开展的隐喻研究还相对较少，本文工作具有很强的探索性。文章首先综述了隐喻计算研究的进展，重点探讨了隐喻计算研究的新成果。近几年来，在自然语言处理研究的大背景下，隐喻计算方面涌现出了很多新的工作，尤其在隐喻识别方面，机器学习方法和大规模知识获取成了新的亮点，隐喻理解和生成方面也有新的成果。本文研究汉语文本中的隐喻计算问题，从最主要的几种隐喻类型入手，探讨隐喻的识别、理解和生成方法，并尝试将隐喻计算应用于自然语言处理其他任务。主要内容如下：1、隐喻识别。针对名词性隐喻和动词性隐喻这两类主要的隐喻类型，分别提出基于词典的名词性隐喻识别方法和基于知识获取的动词性隐喻识别方法。机器学习方法为识别不同类型的隐喻提供了一个统一的框架，考察了机器学习方法在识别不同类型隐喻时的特点和效果。基于词典的名词性隐喻识别。综合利用词典中的语义距离和语义关系知识来识别名词性隐喻，考察隐喻与语义距离和语义关系之间的关联。并把该方法用于新奇隐喻和常规隐喻的区分。基于知识获取的动词性隐喻识别。利用大规模语料结合语义词典自动获取动词主语及宾语的优选语义类，过滤掉抽象语义类，得到字面语义类，基于字面语义类进行动词性隐喻的识别。隐喻识别的机器学习方法。利用支持向量机方法识别名词性隐喻、动词性隐喻及“像”的隐喻用法，考察了真实语料中的隐喻分布情况，比较了不同类型隐喻的识别效果。2、隐喻理解与生成。对于名词性隐喻中的“X是Y”类型，提出基于显著特征的隐喻理解与生成方法。利用搜索引擎从大规模网页中自动获取名词的显著特征知识，构建显著特征知识库，作为隐喻理解与生成的数据基础。并探讨了显著特征获取的统计方法。3、隐喻计算的应用。探讨了隐喻与自然语言处理任务情感分析之间的关系，尝试将隐喻计算用于情感分析。首先探讨了以情感方面的概念为目标域的情感隐喻。然后考察了普通隐喻表达所传递的情感倾向。最后，提出读者情感分类问题，即预测读者读完文本后会产生怎样的情感。从Web上自动获取大规模情感数据，考察不同情感的分布及关联，利用机器学习方法进行自动情感分类。本文方法主要是基于知识获取和机器学习，避免了手工知识库和规则方法的不足。本文工作也积累了一些语言数据资源，可以为隐喻计算、隐喻本体研究及其它相关研究提供支持。﹀
外文摘要：	︿ Non-literal or figurative expressions, such as metaphor and metonymy, are pervasive in natural language. The true meanings of these expressions cannot be obtained directly from their literal meanings, and sometimes the literal meanings are semantically anomalous. Non-literal expressions have become an inescapable challenge for natural language understanding, and metaphor is the most important of them. Metaphor expresses one thing in terms of another based on some similarity between the two. It is by nature a basic cognitive instrument of human beings and has been studied in many disciplines, including linguistics, rhetoric, cognitive science, psychology and philosophy. Therefore, research on metaphor computing can serve natural language processing on the one hand and has multidisciplinary significance on the other. However, metaphor research from the perspective of computer science is still relatively scarce, so this work is highly exploratory. This thesis first reviews the progress of metaphor computing research, focusing on new achievements in recent years. Machine learning methods and automatic large-scale knowledge acquisition have become popular in metaphor recognition, and new methods have also been put forward for metaphor understanding and generation. This thesis studies metaphor computing in Chinese texts. It covers the automatic recognition, understanding and generation of the main kinds of metaphor, and the tentative application of metaphor computing to other natural language processing tasks. The details are as follows: 1. Metaphor recognition. For the two main kinds of metaphor, nominal metaphor and verb metaphor, we propose a lexicon-based nominal metaphor recognition method and a knowledge-acquisition-based verb metaphor recognition method respectively. Machine learning methods provide a general framework for metaphor recognition, so we also apply a machine learning method to recognize several kinds of metaphor and compare and examine the experimental results. Lexicon-based nominal metaphor recognition. The semantic distance and semantic relation knowledge in lexicons are used to recognize nominal metaphor, and the method is also employed to differentiate novel metaphor from conventional metaphor. Knowledge-acquisition-based verb metaphor recognition. The preferred semantic classes of a verb’s subject and object are acquired automatically from a large-scale corpus and a semantic lexicon. After the abstract semantic classes are filtered out, the literal semantic classes remain, and verb metaphors are recognized based on them. Machine-learning-based metaphor recognition. A support vector machine is used to recognize different kinds of metaphor, including nominal metaphor, verb metaphor and metaphor with explicit language markers such as “Xiang”. The distribution of metaphor usage in real texts is investigated, and recognition performance is compared across the different kinds of metaphor. 2. Metaphor understanding and generation. For nominal metaphor of the “X is Y” style, we put forward a unified framework based on salient properties for both understanding and generation. The salient property knowledge is acquired automatically from the web with a search engine. Statistical methods for salient property knowledge acquisition are also discussed. 3. Applications of metaphor computing. The integration of metaphor computing into sentiment analysis is explored. Emotional metaphor, where the target domains are emotional concepts, is discussed first. A simple method is then put forward to compute the opinion polarity of metaphorical expressions. Finally, emotion classification from the reader’s perspective is discussed. The distribution of reader emotions and the relations between different emotions are studied on data collected automatically from the web, and a machine learning method is proposed for automatic reader emotion classification. The methods in this thesis are mainly based on knowledge acquisition and machine learning, which avoids the shortcomings of manual knowledge bases and rule-based methods. Some language resources are built in this work, which can support research on metaphor computing and other related work. ﹀
分类号：	H087
论文总页数：	139
参考文献总数：	0
馆藏号：	048/D2010(64)
公开日期：	2010-06-04

2010-06-03

新闻自动提取系统的设计与实现.彭必扬

题名：	新闻自动提取系统的设计与实现
姓名：	彭必扬
学号：	10617312
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	软件与微电子学院
导师2姓名：	俞敬松
论文答辩日期：	2010-06-03
关键词：	信息提取；DOM树；标记窗；视觉分块；新闻网页；机器学习；决策树；扩展DOM树
外文关键词：	information extraction; DOM tree; tag window; VIPS; news web page; machine learning; decision tree; expanding DOM tree
论文摘要：	︿随着互联网的迅猛增长，大量的网页中充斥着各种广告和无关链接，有用信息与无用信息混杂在一起，增加了正确定位信息的难度。为了应对这些问题，迫切需要一种自动化的技术帮助人们在海量信息中迅速找到自己真正需要的信息，网页文本信息自动提取技术正是解决这个问题的一种很好方法。具体到新闻网页上来说，如何高效准确地自动提取出新闻网页中的主体新闻，是目前研究的热点和难点。本文就这一内容进行了相关的探讨和研究。目前信息提取(Information Extraction)所采用的方法有很多，本文一一进行了介绍。然后本文提出了利用DOM树基于标记窗口的新闻正文提取算法。该方法主要思想是把底层的新闻HTML文档解析成为DOM树的形式，提取其中的标记对，每个标记对记为一个标记窗口，计算每个窗口的权重，再根据权重提取标记窗口的新闻文本。接着本文提出了在此基础之上进一步改进的基于视觉分块（VIPS）的新闻提取算法。该算法先利用视觉分块技术对新闻网页页面分块，然后分别对每个语义块计算其相关统计信息，得到新闻正文所在的语义块，再进行信息提取，以求达到更好的提取效果。然后本文对该算法做出了一定改进，利用决策树方法剔除提取出的正文语义块中的噪声，大大提高了提取的准确性。最后本文提出基于扩展DOM树的提取算法，利用同一新闻网站下的网页会出现大量相似文本噪声的特点，对DOM树节点进行属性扩充，再提取新闻正文，也取得了不错的提取效果。本文结合信息提取算法实现了一个新闻网页自动提取系统，主要功能就是从新闻网页中提取出新闻的标题和正文内容，然后以纯文本的形式保存下来，以备进一步的分析与研究使用。系统针对在新浪网易腾讯等新闻网站,还有银行学校等主页中新闻网页的测试，取得了令人比较满意的实验效果，表明方法切实可用。本文所提出的信息提取算法与其他算法相比，可以广泛适用于各种结构不同的新闻网页，不依赖于新闻网页的具体结构特征，具有较高的灵活性，同时在新闻网页提取过程中兼顾迅速和准确两方面的特点，无需人工干预。本文所实现的系统自动完成对新闻网页主题信息和标题的分析提取和保存，有一定实用价值。﹀
外文摘要：	︿ With the rapid growth of the Internet, a large number of web pages are filled with all kinds of ads and irrelevant links, so useful and useless information are mixed together, which makes it harder to locate the right information. People need an automated technology to help them quickly find the information they really need, and web text information extraction is a good way to solve this problem. For news pages in particular, how to extract the main news text efficiently, accurately and automatically is a current research focus and difficulty. This thesis studies and discusses this topic. Many methods of information extraction are in use today, and this thesis introduces them one by one. It then proposes a news text extraction algorithm based on the DOM tree and tag windows. The main idea is to parse the HTML document into a DOM tree, extract the tag pairs, treat each pair as a tag window, calculate the weight of each window, and then extract the main news text according to the weights. Next, the thesis proposes a further improved news extraction algorithm based on VIPS technology. This algorithm uses VIPS to divide the news page into blocks, calculates statistics for each semantic block, identifies the block that contains the main news text, and then extracts the information in order to achieve a better result. The thesis then improves the algorithm by using a decision tree to remove noise from the extracted semantic block, which greatly improves extraction accuracy. Finally, the thesis proposes an extraction method based on an expanded DOM tree. Because pages under the same news site contain a lot of similar text noise, the method extends the DOM tree nodes with two new attributes and then extracts the news text based on those attributes. This method also achieves good extraction results. Building on the algorithms discussed, the thesis implements an automatic web news extraction system whose main function is to extract the main text from news web pages and save it as plain text for further analysis and research. The system was tested on news websites such as www.sina.com, www.163.com and www.tencent.com, as well as on news pages from the websites of banks and universities, and achieved satisfactory results, showing that the methods are practical in real projects. Compared with other information extraction algorithms, the algorithms proposed in this thesis can be applied widely to news pages with different structures because they do not depend on a specific page structure. They are flexible and extract information from news pages quickly and accurately without manual intervention. The system automatically analyzes, extracts and saves the main content and titles of news pages, which gives it practical value. ﹀
分类号：	TP391.1
论文总页数：	72
参考文献总数：	27
馆藏号：	017/M2010(018)
公开日期：	2010-06-03
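
The tag-window idea described in the abstract above (parse the page, treat each tag pair as a window, weight each window, extract the text of the best one) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the weight formula, so the weight used here (non-link characters minus link characters) is a stand-in, and the names `TagWindowParser` and `extract_main_text` are hypothetical. It uses Python's standard `html.parser` instead of a full DOM tree.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}
SKIP_TAGS = {"script", "style"}


class TagWindowParser(HTMLParser):
    """Collects one 'tag window' per opening/closing tag pair."""

    def __init__(self):
        super().__init__()
        self.open_windows = []    # each entry: [tag, text_parts, link_chars]
        self.closed_windows = []  # (tag, text, weight), innermost closed first
        self.link_depth = 0
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements never form a window
        self.link_depth += tag == "a"
        self.skip_depth += tag in SKIP_TAGS
        self.open_windows.append([tag, [], 0])

    def handle_endtag(self, tag):
        if all(window[0] != tag for window in self.open_windows):
            return  # stray closing tag
        while self.open_windows:
            open_tag, parts, link_chars = self.open_windows.pop()
            self.link_depth -= open_tag == "a"
            self.skip_depth -= open_tag in SKIP_TAGS
            # Assumed weight: plain text counts for a window, link text against it.
            weight = sum(len(part) for part in parts) - link_chars
            self.closed_windows.append((open_tag, " ".join(parts), weight))
            if open_tag == tag:
                break

    def handle_data(self, data):
        text = data.strip()
        if not text or self.skip_depth:
            return
        # Text belongs to every window that is currently open around it.
        for window in self.open_windows:
            if self.link_depth:
                window[2] += len(text)
            else:
                window[1].append(text)


def extract_main_text(html):
    """Return the text of the highest-weight window, or '' if none closed."""
    parser = TagWindowParser()
    parser.feed(html)
    parser.close()
    if not parser.closed_windows:
        return ""
    best = max(parser.closed_windows, key=lambda window: window[2])
    return best[1]
```

Because windows are recorded as they close, `max` prefers the innermost window when weights tie, so a content block wins over an enclosing container that adds only link noise. Non-link noise such as a footer would need a different weight, which is where the VIPS and decision-tree refinements in the abstract come in.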

2010-06-02

计算机辅助翻译专业的课程研究与改革.韩依彤

题名：	计算机辅助翻译专业的课程研究与改革
姓名：	韩依彤
学号：	10817141
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	何卫
导师1单位：	外国语学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-06-02
关键词：	计算机辅助翻译专业；课程研究；改革
论文摘要：	︿随着翻译市场的扩大，翻译工作者的任务量也随之增加，单纯地依靠传统手工翻译已经不能满足市场的需求，这就要求更多的翻译者使用CAT技术来辅助翻译。这一需求为计算机辅助翻译课程的开设奠定了基础，可以说没有本地化的快速发展，就没有计算机辅助翻译专业的诞生和壮大。本文首先介绍了中国本地化市场的概况，由于经济全球化的加速以及中国市场巨大的发展潜力，很多跨国公司纷纷进入中国市场，因此，中英文互译的需求也越来越多，庞大的市场需求说明了计算机辅助翻译专业开设的必要性和重要性。在第二章中笔者对课程设计的相关理论进行了详细的论述，其中包括课程的概念以及在课程设计中不可或缺的理论依据。计算机辅助翻译专业在我国还处于不断改革和完善的过程中，借鉴其他院校的经验在完善课程过程中起着至关重要的作用。由此，笔者在第三章中讨论了6个计算机辅助翻译专业的课程设置情况，其中包括欧盟LETRAC项目，香港中文大学，北京大学，利默里克大学，利兹大学和伦敦大学。通过对这6个课程的研究，笔者总结了其相似性和独特之处。针对以上6种课程的优势和不足之处，笔者以北京大学计算机辅助翻译专业的学生为调查对象，对三个年级的所有研究生进行了问卷调查，调查内容包括学习需求调查、课程设置满意度调查以及期望增设课程调查。调查结果显示：学生在研究生学习阶段，大多会根据个人的职业规划有重点地、深入地学习相应的课程。以北京大学计算机辅助翻译专业学生为例，他们对现有课程的内容较为满意，而对部分课程的时间设置以及期望增设课程表现出了不同程度的需求。通过对学生学习需求的调研以及调查结果所反映的问题，笔者提出了计算机辅助翻译课程改革的设想，即半年学习本地化基础知识，学生选择学习方向（翻译或工程处理）后再利用半年时间学习专业知识，这一方案就是“1/2+1/2”设想，并用教学实验验证了这一设想。在第五章中笔者详细介绍了实验项目的目的、项目的内容和实际操作情况，并在项目中收集数据，通过对项目得出的数据进行比对和分析，得到了这样的结论：分方向学习的学生在翻译水平和工程处理上均高于二者同时学习的学生。这种课程安排从实际上解决了“浮”和“浅”的问题。﹀
分类号：	H059
论文总页数：	67
参考文献总数：	43
馆藏号：	017/M2010(272)
公开日期：	2010-06-02

主位述位理论在软件翻译中的应用.高辉

题名：	主位述位理论在软件翻译中的应用
姓名：	高辉
学号：	10817125
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	何卫
导师1单位：	外国语学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-06-02
关键词：	主位；主位推进；翻译；保留；移位
外文关键词：	Theme; Thematic progression; translation; keeping; displacement
论文摘要：	︿软件国际化项目数量和规模不断扩大，对国际化项目的质量提出了更高的要求[1]。针对实习中面对软件翻译没有一个系统的指导原则的现状，本文引入了韩礼德功能语法理论中的主位与述位理论，提出了在软件翻译中应用主位与述位理论的假说，以提高软件国际化的水平和质量。以韩礼德的主位述位理论为依据，本文选取中英文软件用户手册共14篇作为语料，其中中文语料49311字，英文语料45504字，在借鉴国内学者研究的基础上，作者总结并分析了软件用户手册中的11种主位推进模式。其中作者得出了软件用户手册中新的推进模式：祈使句推进模式，述位对比推进模式，述位对应推进模式。作者还对14篇语料分别进行定量分析，比较了中英文用户手册中各种主位推进模式在全部推进模式中所占的比例。通过对源语言推进模式的分析，作者发现中文语料和英文语料中主位同一模式都占比例很大，但英文语料的主位同一比中文的百分比大。中文和英文的非主位结构模式比重也很大，但以中文的为多。汉语中线性推进模式很多，英语中的合并推进比汉语中的比例略大。这说明软件用户手册中，英汉两种语言在主位推进模式上有相似性。作者建议，在翻译时，要尽量保留原文的主位推进模式，这样有助于译者更好地掌握信息内容在语篇中的分布情况；有时候要通过主位述位的移位，从而更好地突出重点，翻译出更切合源语的目标语言。﹀
外文摘要：	︿ The increasing scale of software internationalization brings a demand for high-quality translation. During the internship period, the author found no systematic guiding approach to software translation. In order to improve the quality of software internationalization, the author introduces Halliday's functional grammar theory and puts forward the hypothesis that thematic progression can be used in software translation. Based on Halliday's Thematic theory, the author chooses 14 English and Chinese software user manuals, comprising 49311 Chinese characters and 45504 English words, and summarizes the Thematic progression patterns in them. The author also proposes three new patterns: the imperative action pattern, the Rheme comparison pattern and the Rheme-correspondent pattern. Through quantitative analysis, the author compares Thematic progression in the English and Chinese corpora. The author concludes that constant Theme progression patterns and non-Thematic progression patterns account for a large percentage in both languages. However, constant Theme progression patterns are slightly more frequent in the English manuals than in the Chinese manuals, while non-Thematic progression patterns are slightly more frequent in the Chinese manuals than in the English ones. Moreover, linear progression is more frequent in Chinese than in English, while combined progression is more frequent in English than in Chinese. The results show that Thematic progression patterns in English and Chinese software user manuals are similar. The author suggests that Thematic progression patterns be kept in translation. Sometimes, to produce appropriate target language, displacement between Theme and Rheme is needed. Finally, the analysis of Chinese and English Thematic progression patterns can help the translator better grasp the central information, gain a better understanding of the original text, and translate more accurately. ﹀
分类号：	H059
论文总页数：	42
参考文献总数：	24
馆藏号：	017/M2010(269)
公开日期：	2010-06-02

本科英语专业翻译教学模式的改进研究.温彬

题名：	本科英语专业翻译教学模式的改进研究
姓名：	温彬
学号：	10817388
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-06-02
关键词：	英语专业；翻译教学；现状；问题；建议；对策
外文关键词：	English major; translation teaching; current situation
论文摘要：	︿随着改革开放政策的不断深入和经济的蓬勃发展，各个领域对翻译人才的需求量越来越大，要求也越来越高，对翻译工作者的急切需求促进了翻译教学的发展。如何迅速有效地培养大批合格的译员，成为了我国翻译教育工作者需要认真思考解决的问题。近年来，中国翻译教学发展迅速，它在中国教育体系中占据着举足轻重的地位。然而，当前翻译教学体制的发展现状却不尽如人意，提高当前英语专业本科翻译教学的质量显得尤为重要。本文简要回顾了中国的翻译历史和翻译教学历史，深入分析了国内英语专业本科翻译教学的现状和存在的问题，并针对这些问题探讨如何进行翻译教学改革。本选题的研究内容主要有以下几个部分：第一章“前言”，从整体上介绍本篇论文的研究背景；第二章“文献综述”，简单回顾了国内的翻译历史和翻译教学历史以及翻译与翻译教学之间的联系；第三章“调查问卷设计与数据统计”，笔者针对本科英语专业翻译教学的现状，设计调查问卷并对数据进行统计分析；第四章“结合调查问卷分析国内本科英语专业翻译教学的现状及存在的问题”，主要从翻译教学的培养目标、翻译教材的选用、翻译理论的讲授、翻译技巧的讲授和翻译实践练习这5个方面出发；第五章“提出改进国内本科英语专业翻译教学的建议和对策”；最后一章进行总结。本文的重点在于提出翻译教学改革的建议和对策，笔者在建议本科英语专业翻译教学应该努力适应素质教育要求，建立“以发展翻译能力为中心”的翻译课程培养目标、编写符合时代要求的翻译教材、合理安排翻译教学内容、恰当分配翻译理论教学与翻译技巧教学之间的比例、注重文化因素对翻译教学的影响、加强翻译实践练习的基础之上，也指出当前的翻译教学应该注重运用新信息手段促进教学，如多媒体技术、网络资源、计算机辅助翻译技术等，从而使本科英语专业的翻译教学跟得上时代步伐，更符合社会的需求。希望本论文能为本科英语专业翻译教师在其翻译教学中提供有益的参考，为推动翻译教学的发展做出一份微薄之力。﹀
外文摘要：	︿ With the implementation of the reform and opening-up policy and the accession to the WTO, large numbers of translators are needed to meet the needs of modern society. In the meantime, the problems surrounding the construction of translation as a discipline have aroused the interest of numerous scholars and teachers. Therefore, how to efficiently train a great number of qualified translators has become a serious problem for translation instructors. In recent years, translation teaching has developed rapidly in China and has come to hold an important position in the Chinese education system. However, the current situation is not satisfactory, so it is becoming more and more important to improve the quality of translation teaching for English majors. This thesis gives a brief review of translation history and translation teaching history, analyzes the current situation and existing problems of translation teaching, and finally discusses measures for reforming translation teaching. The thesis is made up of six chapters. Chapter 1 is the preface, which introduces the overall research background. Chapter 2 is the literature review, which reviews translation history, translation teaching history and the relationship between the two. Chapter 3 designs the questionnaires (a teacher’s version and a student’s version) and analyzes the collected data. Chapter 4 points out the current situation and existing problems of translation teaching from the angles of teaching objectives, teaching materials and teaching content (including translation theories, translation techniques and translation practice). Chapter 5 puts forward proposals for translation teaching for undergraduate English majors. The last chapter is the conclusion. The emphasis of this thesis is on these proposals. In order to meet the needs of quality-oriented education, the author suggests establishing the teaching objective of “focusing on translation ability”, compiling new materials that meet social needs, arranging teaching content reasonably, balancing the proportions of translation theory and translation technique properly, paying attention to the influence of culture on translation teaching, and reinforcing translation practice. Besides these aspects, the author also points out that translation teaching should make use of new technologies of the information era, such as multimedia, the Internet and computer-aided translation. As a result, translation teaching for undergraduate English majors can keep up to date and meet social needs. The author hopes that this thesis can provide helpful insight for translation teachers of English majors and contribute to the promotion of translation teaching and the training of highly qualified translators. ﹀
分类号：	H059
论文总页数：	96
参考文献总数：	64
馆藏号：	017/M2010(326)
公开日期：	2010-06-02

使用多媒体语料库改善英语交际能力教学的研究.刘晟

题名：	使用多媒体语料库改善英语交际能力教学的研究
姓名：	刘晟
学号：	10717207
论文语种：	chi
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	软件与微电子学院
论文答辩日期：	2010-06-02
关键词：	体态语；多媒体语料库；英语交际
论文摘要：	︿英语是人类沟通所使用的一种自然语言，人们都必须通过学习才能获得语言能力。当英语作为第二语言供人们学习时，由于巨大的文化背景的差异和身体语言的不同，大多数人都感到学习英语的困难。作为沟通和交流的工具，英语的实用性很强，在语音传达之外，往往还伴随着大量的肢体动作——身体语言，这与听觉相辅相成，甚至通过对肢体语言的理解，能够更好更深入地了解对方所要表述的意思。作者试图从那些以母语为英语的人身上找出一些规律，诸如他们如何去“说”英语——不仅是发音，还有眼神、面部表情、肢体动作等等。通过计算机对大量的视频进行分析，最后找出其中的规律，从而建立一个多媒体语料库系统，将现有的英语视频资源及其字幕作为语料，用计算机技术合成成为多媒体语料库。通过视频语料中画面语言和人物动作的双重刺激，加深英语学习者对英语口语的理解，从而提升英语交际能力。作者把英语首先作为沟通和交流的工具来看，通过分析和研究，找出这个工具“外在”的特性，即除了语法、句型之外的特性。这些特性勾勒出日常生活中人们如何使用英语这个工具来进行沟通和交流的，具有很强的实用性。其次，如果进一步研究的话，会发现母语是英语的人，在不同的地区、不同的文化背景下，对英语及肢体语言的使用，也是有差别的，这部分差别不是本文涉及的研究和讨论范围，论文中将不做详细的阐述和分析。同时，为了科学地评估本文所涉及的研究内容，及本文所得出的研究结论，作者用实验的方法来评估多媒体语料库系统和传统英语教学方法对于提升英语理解效果的差异性，以及通过问卷调查的形式来了解英语学习者在使用本系统后，对于系统的满意度。实验结果发现：第一，利用计算机影像技术能够有效地将肢体语言存储分类，构建新型的多媒体语料库。第二，多媒体语料库教学比传统英语教学更有效。第三，通过问卷调查，用户对语料库满意度较好。﹀
分类号：	H087
论文总页数：	55
参考文献总数：	0
馆藏号：	017/M2010(109)
公开日期：	2013-06-02

2010-06-01

国内翻译公司的推广策略研究——以元培翻译公司为例.陈庆

题名：	国内翻译公司的推广策略研究——以元培翻译公司为例
姓名：	陈庆
学号：	10817088
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	何卫
导师1单位：	外国语学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-06-01
关键词：	翻译公司；营销策略；元培翻译
论文摘要：	︿随着中国加入WTO后的高速经济发展，中国与国际间的合作和交流也在大幅度增加，相应对翻译的需求也呈快速上升趋势，而且需求层面也从原来的特定行业和系统转为了整个社会。同时，从事翻译服务的单位和人员也在迅速增加，但因为门槛很低，翻译的质量好坏没有标准，业界主要采用压低成交价格等低层面的竞争手段，导致市场陷入恶性循环；而且很多翻译公司主要采用熟人介绍的方式，一旦该熟人关系断裂或不需要翻译服务，它们就很难继续成活下去。这就要求翻译公司探索新的营销模式，按照市场规律进行商业化的包装和运作，以保证公司的长远发展，在此，本文将对这方面进行浅探。综合以上背景，本研究以国内翻译公司为研究对象，试图探讨它在成长过程中的相关以下问题：国内翻译公司在成长的过程中，应该怎样推出自己的企业？以什么形象出现？有什么定位？在这个过程中，将会出现什么样的障碍？应该怎样克服？在文献综述的基础上，本研究得到国内翻译公司发展的几个关键影响因素：营销推广情况、质量和价格的重要性、专业程度、价格和服务的定位、市场的开拓以及其它可能遇到的障碍，然后根据市场营销的经典理论框架4P策略（价格、产品、渠道、促销）对当前国内翻译行业进行了分析，获得对国内翻译行业的整体认知，再以元培翻译公司为例进行案例分析，获得重视营销推广公司的典型翻译公司的现状，最后进行了译员深度访谈和定性分析，得出研究结果。同时，本研究还发现了新的观点，并对所建议的营销策略进行了修正。本文的研究贡献在于详尽分析了国内翻译公司将会遇到的种种障碍，创新性地将专业咨询公司的网络口碑营销分析经验以及营销行业的模型和理论进行应用，提出了相应的营销策略，对于国内翻译公司推销自己、走向市场有较好的参考价值。﹀
分类号：	H059/TP311.52
论文总页数：	103
参考文献总数：	48
馆藏号：	017/M2010(259)
公开日期：	2010-06-01

计算机辅助翻译技术在公关行业双语资料管理中的应用.尹静

题名：	计算机辅助翻译技术在公关行业双语资料管理中的应用
姓名：	尹静
学号：	10717351
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	软件与微电子学院
导师2姓名：	俞敬松
论文答辩日期：	2010-06-01
关键词：	计算机辅助翻译技术；公关行业翻译特点；公关行业翻译资料管理系统
论文摘要：	︿公关行业的主要特点是做好社会组织与媒体的沟通工作，由于国内公关公司的工作人员和其服务的客户当中有很多外籍人士，因此翻译成为公关行业中顺利沟通的一个重要环节。目前公关行业缺乏一个能够对翻译资源和翻译工作进行有效管理的系统，造成了翻译资料管理混乱、翻译工作流程冗长、翻译质量和效率低下以及对公关传播资料分析不足的现状。在本文中，作者力图利用计算机辅助翻译技术为公关行业翻译资料管理和翻译流程管理提供更加有针对性的、优化的解决方案。并且通过设计、实现和初步测试公关行业双语资料管理系统，验证该解决方案的优点与不足。本论文依循“发现问题—分析问题—解决问题—验证问题”的思路，通过理论分析、系统设计与实现、测试验证三个步骤，进行了深入地研究和探讨。首先，在基础研究方面，作者在实践的基础上对公关公司翻译工作的内容、翻译的语言特点、翻译流程、翻译人员组成等进行了深入而全面的研究，归纳出公关行业翻译的特点和存在的问题，为公关行业双语资料管理系统的设计指明了目标。其次，作者对计算机辅助翻译技术之于公关行业的翻译进行了适用性和优化性探讨，在将计算机辅助翻译技术运用到公关行业翻译的同时，针对公关行业提出了独具特点的设计。最后，遵循需求分析、系统设计和系统实现的思路，设计并实现了公关行业双语资料管理系统。其中翻译记忆模块，任务管理模块以及关键词统计模块是本系统针对公关行业而设计的功能模块，也是本系统的亮点。作者将该系统初步部署于北京一公关公司内部，对系统的初步测试和考察结果表明，公关公司双语资料管理系统能够满足公关公司工作过程中翻译结果查找，翻译流程管理，以及查看公关效果评估等需求，同时也发现了系统设计中的不足，为今后系统的改善和发展指明了方向。﹀
分类号：	TP319.2/G434
论文总页数：	71
参考文献总数：	24
馆藏号：	017/M2010(187)
公开日期：	2010-06-01

用信息化管理提升翻译企业核心竞争力的研究–以北京元培翻译为例.王晓婷

题名：	用信息化管理提升翻译企业核心竞争力的研究--以北京元培翻译为例
姓名：	王晓婷
学号：	10817370
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	软件与微电子学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-06-01
外文题名：	A Study on the Improvement of the Translation Companies’ Core Competence with the Implementation of Information Management
关键词：	核心竞争力；信息化管理；业务管理信息化；翻译工作信息化
外文关键词：	core competence; information management; information management in operation control; information management in translation
论文摘要：	︿随着国际交流与合作日益增多，各行各业对翻译服务的需求也随之越来越旺盛。我国的翻译服务行业已取得了长足的进步，正在向着规模化、专门化、产业化的发展方向迈进。但是作为一个新兴的语言服务行业，我国整个翻译产业的发展尚处在初级阶段，国内市场翻译公司的规模以小企业居多，品牌效应不强，大部分翻译企业缺乏核心竞争力。翻译企业如何提高自身的核心竞争力，如何有效地利用现代办公管理软件和计算机辅助翻译技术，关系着翻译企业能否在国内外激烈的竞争形势下，把握并取得持续的企业竞争优势，铸就强大的翻译品牌，这也是本文研究的重点。本文的研究方法包括文献探讨、个案实践、调查问卷和专家访谈。文献综述部分阅读了大量相关书目、学术论文、调查报告和网络文章，重新梳理了相关理论，并分析了国内外翻译行业的信息化现状，以此作为本研究的理论依据和参考。个案实践部分是以笔者实习所在的公司—北京元培世纪翻译有限公司作为研究案例，系统地分析该公司的信息化管理实践。根据调查问卷和专家访谈，对信息化管理实践过程中易出现的问题进行归类、分析，并提出改进的建议和对策。最后，本文提出翻译企业的信息化管理战略，即实现业务管理的信息化和翻译工作的信息化。翻译企业的信息化，是翻译企业建立现代企业制度的过程，是变革传统生产方式的关键。本文研究结论对翻译企业的信息化改革有重要的启示意义。﹀
外文摘要：	︿ With the increasing demand for translation services in international exchange and cooperation, the translation industry in China has made great progress in scale, specialization and industrialization. However, as an emerging language service market, it still faces severe challenges. The development of office automation systems and computer-aided translation technologies has brought new opportunities for translation companies in fierce international competition. This study focuses on how to improve the core competence of translation companies through the implementation of information management. The research methods include literature review, case study, specialist interviews and a survey. Following the literature review, this study defines the core competence of translation companies, elaborates information management strategies, and summarizes the development of information management in the translation industry at home and abroad. The case study gives a brief review of the implementation of information management by Yuanpei Century Education & Technology Co., Ltd. Based on the results of the survey and the specialist interviews, this study draws the following conclusions: the implementation of information management is important for translation companies to maintain lasting competitive advantages and build brands. To transform the management mode, more effort should be made in operation control and the translation process. Regular training and requirement analysis will help solve the problems that occur in the implementation of information management. ﹀
分类号：	H059/F270.7
论文总页数：	82
参考文献总数：	34
馆藏号：	017/M2010(320)
公开日期：	2010-06-01

翻译技术课程的教学实践研究.王华树

题名：	翻译技术课程的教学实践研究
姓名：	王华树
学号：	10817348
论文语种：	chi
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-06-01
外文题名：	Studies on Teaching Practice of Translation Technology Courses
关键词：	建构主义；翻译技术教学；计算机辅助翻译；任务驱动教学；项目驱动教学
外文关键词：	Constructivism; translation technology teaching; computer-aided translation; task-driven teaching model; project-driven teaching model
论文摘要：	︿信息化时代语言服务发生了翻天覆地的变化，翻译市场呼唤具备信息技能和翻译技能的复合型翻译人才。然而，国内翻译技术教育还处在萌芽阶段，新一代译员的培养模式尚在摸索之中。在新时代语言服务发生巨大变化的背景下，本文开篇分析了国内外翻译技术教育现状与问题，并结合北京大学计算机辅助翻译专业概况，引出了本文讨论的重点以及研究意义。在本文中间部分，作者先阐述建构主义与翻译技术教学之间的关系，然后以翻译技术实践和综合实践课程为例，结合作者的助教实践，分析、说明建构主义教学理论指导下的翻译技术教学课程设计思路、课程内容以及教学实施情况，并在助教工作实践中探讨如何通过任务驱动和项目驱动的模式组织和实施翻译技术教学。接下来，作者介绍了在建构主义教学理论框架下，以学习者为中心的教学理念在北京大学计算机辅助翻译专业方向翻译教学中的应用，探讨如何通过网络互动教学模式来发挥学生的积极性，指出多样化的教学方式和手段在发挥学生积极性和提高翻译技术教学效率方面具有重大的意义。文章最后介绍了北京大学计算机辅助翻译专业翻译技术教学评估情况，总结了翻译教学方面值得借鉴的一些经验，针对助教实践中遇到的一些问题进行分析，指出了翻译技术教学面临的多方压力和挑战，并尝试提出了应对方案。本文通过整体介绍北京大学计算机辅助翻译专业的翻译技术课程体系和教学实践，探讨如何更好地进行翻译技术课程教学的组织和实施，希望北京大学计算机辅助翻译专业的教学实践能够为其他高校翻译技术教学或者翻译教学研究工作者提供参考和借鉴。﹀
外文摘要：	︿ In the information age, language services have undergone tremendous changes, and the translation market calls for all-round translation talents who are well equipped with both information technology skills and language proficiency. However, translation technology education in mainland China is still in its infancy, and how to train a new generation of talents for the modern language service industry needs further discussion. Firstly, by analyzing the current problems with translation technology teaching both at home and abroad against the background of the dramatic changes in language services, the author draws out the focus and research significance of this thesis in combination with computer-aided translation education at Peking University. In the main body, the author starts by clarifying the relationship between Constructivism and translation technology teaching, then, based on two practical courses in the Master’s Program at Peking University, illustrates the translation technology curriculum design and teaching practice guided by constructivist pedagogy, and discusses how to organize the teaching with task-driven and project-driven models. After introducing the practice of the learner-centered teaching model in the Master’s Program under the Constructivist framework, the author discusses how to arouse students' enthusiasm through a student-centered interactive teaching model, indicating that diversified means of teaching are of great significance in improving the efficiency of translation technology teaching. Finally, the author evaluates the program, sums up the teaching experience, analyzes some problems encountered in his teaching practice, and tries to provide feasible solutions that may serve others. By introducing the overall curriculum system and teaching practice of the Master’s Program in computer-aided translation at Peking University, the author attempts to explore better ways of teaching, in the hope that this practice of translation technology teaching can be instrumental for other scholars and researchers. ﹀
分类号：	H059
论文总页数：	71
参考文献总数：	47
馆藏号：	017/M2010(316)
公开日期：	2013-06-01

2010-05-28

一个即时通信翻译系统的设计与实现.高天奇

题名：	一个即时通信翻译系统的设计与实现
姓名：	高天奇
学号：	10817128
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	何卫
导师1单位：	外国语学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-05-28
关键词：	即时通信；翻译服务；Web Service；翻译插件；翻译记忆
论文摘要：	︿近年来，即时通信软件伴随着互联网的高速发展，已经成为了网络上必不可少的通信方式。随着全球化的加快，世界各地的即时通信用户进行沟通的机会越来越多，用户需要一种方便快捷，准确有效的即时通信翻译工具。但是现有的即时通信翻译工具的功能过于简单，不支持群组通信的消息翻译，也不能提供专业的人工翻译服务。本文围绕着用户的实际需求，从用户需求的角度分析并叙述了一个即时通信翻译系统的设计与实现。为了实现即时通信翻译系统，本文设计并实现了即时通信客户端翻译插件，即时通信翻译服务器以及一套完整的后台翻译系统管理程序。本文实现的即时通信翻译系统特点有：1.支持多种语言的1对1即时消息翻译，并提供给用户多种选择；2.在1对1即时消息翻译中提供翻译记忆的功能，给用户良好的翻译体验；3.支持群组即时消息的翻译，在群组中可同时提供多达5种语言即时消息的翻译；4.支持译员协助功能，允许专业译员加入进行翻译，以提供高质量、可信赖的译文；5.提供了实用的后台翻译系统管理程序，便于译员的管理和翻译记忆的管理。本文提出的即时通信翻译系统的设计方案合理，系统架构清晰，高效实用，可以为世界各地的不同语言的即时通信用户提供方便快捷，准确高效地即时通信翻译服务。﹀
外文摘要：	︿ In recent years, with the rapid development of the Internet, instant messaging has become an indispensable means of communication. With increasing globalization, instant messaging users have more opportunities to communicate with people throughout the world, and they need a convenient, quick, accurate and effective translation tool. However, the available translation tools for instant messaging are too limited in function: they neither support the translation of group messages nor provide professional human translation services. Starting from users' actual needs, this paper describes the design and implementation of an instant messaging translation system. To implement the system, I designed and implemented a client-side translation plug-in, a translation server and a complete back-end translation management program. The key features of the system are: 1. It supports multilingual translation in one-to-one instant messaging and offers users multiple options. 2. It provides translation memory in one-to-one instant messaging translation, giving users a good translation experience. 3. It supports the translation of group instant messages and can translate up to five languages simultaneously within a group. 4. It supports translator assistance, allowing professional translators to join the chat and provide high-quality, trustworthy translations. 5. It provides a practical back-end management program for managing translators and translation memories. The design of the system is sound and its architecture is clear. The system is efficient and practical, and can provide convenient, quick, accurate and effective translation services for instant messaging users. ﹀
分类号：	TP393.02/TN915
论文总页数：	76
参考文献总数：	23
馆藏号：	017/M2010(270)
公开日期：	2010-05-28

文本过滤及其在问题过滤中的应用.李吉

题名：	文本过滤及其在问题过滤中的应用
姓名：	李吉
学号：	10617229
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-05-28
外文题名：	Text Filter and its application in question filter
关键词：	文本分类；文本过滤；特征选择；向量空间模型
外文关键词：	Text Classification; Text Filter; Feature Selection; VSM
论文摘要：	︿互联网的迅速发展使得各种数据急剧膨胀，在给人们带来方便的同时，也出现了信息过量、色情和暴力及其他不良信息充斥互联网等许多问题，亟待解决。作为解决这些问题的基本工具之一，文本信息过滤技术得到了广泛关注。中文文本过滤是中文信息处理的一个分支，它是根据用户的需求，在大量信息中搜索用户感兴趣的信息，并屏蔽其他无用信息。而传统的基于关键字或IP地址等过滤方法已经不能很有效的解决这些问题，由此本文对文本过滤方法进行了研究，希望能对信息内容进行分析，达到对网络信息的过滤。本文介绍了一种基于文本分类的文本过滤模型。首先简要介绍了文本过滤算法的关键问题，包括文档表示和特征选择算法，然后给出了文本过滤模型的总体设计，先对收集的文本进行改进的预处理工作，尽量减少对分词的干扰。其次借助向量空间模型的思想，将文本表示为向量形式，然后根据用户的过滤需求，从用户预先收集的训练样本中提取出信息特征过滤模型，再根据待测文本与过滤模型的匹配情况来判定待测文本是否满足用户过滤需求。最后将本文的模型用于问题过滤，实验结果表明该系统能够有效过滤垃圾问题，同时能减少合法问题的误判，在对垃圾问题进行过滤时具有良好的性能。不过要使文本信息过滤智能化，还是一个很复杂的过程，有待进一步的研究。﹀
外文摘要：	︿ With the rapid development of the Internet, the amount of electronic data has increased dramatically. While this brings convenience, it also brings problems such as information overload and the spread of pornographic, violent and other harmful content. As one of the basic tools for dealing with these problems, text filtering has drawn much attention. Chinese text filtering is a branch of Chinese natural language processing. According to users' requests, it searches dynamic data for useful information and eliminates useless or irrelevant information. Traditional filtering techniques, such as those based on keywords or IP addresses, can no longer solve these problems effectively, so this paper studies text filtering in order to filter harmful information on the Internet. The paper applies text classification to the text filtering domain and proposes a filtering method based on text classification. It first briefly introduces the key techniques of text filtering, including document representation, feature selection and the text classification method used in the paper. Next, drawing on the vector space model, the paper represents each text document as a vector. The system then builds a feature-based filtering model according to the user's filtering needs. Finally, the filtering model is applied to question filtering. The experimental results show that the system filters spam questions effectively and with good performance, while reducing the misjudgment of legitimate questions. Making text filtering intelligent, however, remains a complex task that requires further study. ﹀
分类号：	TP391.12
论文总页数：	67
参考文献总数：	39
馆藏号：	017/M2010(014)
公开日期：	2010-05-28
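
The vector space matching described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the function names, the use of raw term frequencies, the summed training profile and the 0.5 threshold are all assumptions made for the example, and the input texts are assumed to be already segmented into tokens.

```python
import math
from collections import Counter


def to_vector(tokens):
    """Represent a tokenized text as a sparse term-frequency vector."""
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_profile(training_texts):
    """Sum the vectors of the user's training samples into one filter profile."""
    profile = Counter()
    for tokens in training_texts:
        profile.update(tokens)
    return profile


def is_filtered(tokens, profile, threshold=0.5):
    """Flag a text whose similarity to the profile reaches the threshold.

    The threshold is an arbitrary illustrative value, not taken from the thesis.
    """
    return cosine(to_vector(tokens), profile) >= threshold
```

A profile built from a few spam-like questions would flag a new question that shares most of its terms with them and let an unrelated question through. A real system would normally add feature selection and term weighting before this matching step.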

基于隐马尔可夫模型的股指预测研究.田健南

题名：	基于隐马尔可夫模型的股指预测研究
姓名：	田健南
学号：	10617343
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-05-28
关键词：	股票预测；经济因素；隐马尔可夫模型；维特比算法
论文摘要：	︿近20年来股票市场在我国不断成长，逐步成为证券业乃至整个金融业必不可少的组成部分，并且受到越来越多投资者的关注，因而对股票指数走势的分析和预测都有重大的理论意义和客观的应用价值。本文的目的在于利用隐马尔可夫模型对股指未来走势进行研究。第一章：简述了课题研究的内容和意义以及股票指数的相关概念，总结回顾了股指预测的常用方法、研究现状及存在的问题。第二章：介绍了隐马尔可夫模型及其在词性标注领域的应用。第三章：分析了股指和经济形势的关系，并且以此作为理论基础利用隐马尔可夫算法建立预测模型。第四章：选择具有较好市场代表性的股票指数，利用第三章建立的模型，对指数进行预测。预测结果表明，文中所建立的模型比较精确，可将其用于实际股价指数预测，为投资者提供一定帮助。第五章：总结了全文的工作成果和对未来工作的展望。此外，由于宏观经济和股票价格指数的复杂性及笔者研究水平的局限，本文提出的模型仍然有许多亟待完善之处，希望以后能继续进行深入研究。﹀
分类号：	TN912.3/F832.5
论文总页数：	52
参考文献总数：	30
馆藏号：	017/M2010(019)
公开日期：	2010-05-28
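
The keywords of this record name the Viterbi algorithm, which finds the most likely hidden-state sequence of a hidden Markov model. Below is a minimal sketch of that decoding step. The two hidden states, the three observation symbols and every probability in the example are invented for illustration; the abstract does not give the thesis's actual model.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden-state sequence."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            column[s] = max(
                (prev[r][0] * trans_p[r][s] * emit_p[s][obs], r) for r in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return prob, path


# Hypothetical model: hidden economic condition, observed daily index movement.
states = ["expansion", "contraction"]
start_p = {"expansion": 0.6, "contraction": 0.4}
trans_p = {
    "expansion": {"expansion": 0.7, "contraction": 0.3},
    "contraction": {"expansion": 0.4, "contraction": 0.6},
}
emit_p = {
    "expansion": {"up": 0.5, "flat": 0.4, "down": 0.1},
    "contraction": {"up": 0.1, "flat": 0.3, "down": 0.6},
}
prob, path = viterbi(["up", "flat", "down"], states, start_p, trans_p, emit_p)
# path == ['expansion', 'expansion', 'contraction']
```

For long observation sequences the products underflow, so practical implementations usually work with log probabilities instead.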

面向计算机辅助翻译的自动化译前准备研究.张永伟

题名：	面向计算机辅助翻译的自动化译前准备研究
姓名：	张永伟
学号：	10717388
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-05-28
关键词：	计算机辅助翻译；句子自动对齐；关键短语；关键句；自动提取
外文关键词：	Computer-Aided Translation; Automatic Alignment of Bilingual Sentences; Key Phrases; Key Sentences; Automatic Extraction
论文摘要：	︿随着经济全球化的迅速发展、国际合作的不断深入，需要翻译的文本数量不断增多，同时参与一个翻译项目的译员人数也在不断增加。如何在这种环境下，高效的翻译这些文本显现尤为迫切与必要。显然，计算机等各类信息技术工具是我们唯一可以信赖的支柱，很多研究工作围绕提高计算机辅助翻译系统的效率而开展。作为计算机辅助翻译自动化译前研究的子任务，本文从三个方面做了深入讨论与研究，分别是双语句子自动对齐与检索，以及关键短语和关键句的自动提取。双语句子自动对齐可以将容易获取的篇章级对齐的双语文本自动对齐为句子级别对齐的文本。由句对齐语料组成的翻译记忆库基础性语言资源，对计算机辅助翻译的应用具有重大的意义。关键短语提取工具可以自动提取文本内的术语/关键短语，在翻译实务中可以为译员提供词汇、术语和短语层次的辅助支持，是大规模团队翻译工作的术语一致性保证的最重要支持。关键句提取工具可以提取文本内指定数目的关键句子（包括对重复性句子的加权），有助于译员迅速理解文章主题。在翻译任务尤其是大规模文本翻译任务开始前，由经验丰富的译员率先翻译出来，还可以协助整个翻译团队确定译文风格、保持译文一致性，并最终提高译文质量。﹀
外文摘要：	︿ The rapid development of economic globalization and the deepening of international cooperation have increased both the number of texts to be translated and the number of translators working on a single translation project. Under these circumstances, translating those texts efficiently has become particularly urgent and essential. Information technology tools such as computers are clearly what we can rely on most, and many studies have therefore focused on improving the efficiency of computer-aided translation systems. As a sub-task of research on automated pre-translation preparation for computer-aided translation, this thesis discusses three aspects in depth: the automatic alignment and retrieval of bilingual sentences, the automatic extraction of key phrases, and the automatic extraction of key sentences. Automatic alignment of bilingual sentences converts easily accessible bilingual texts aligned at the document level into texts aligned at the sentence level. A translation memory built from sentence-aligned bilingual texts is a basic language resource of great significance for the application of computer-aided translation. The key phrase extraction tool automatically extracts terms or key phrases from the texts. It gives translators auxiliary support at the level of vocabulary, terms and phrases, and is the most important support for guaranteeing terminology consistency in large-scale team translation. The key sentence extraction tool extracts a specified number of key sentences from the texts, weighting repeated sentences, and helps translators quickly understand the subject of the text. Before a translation project begins, especially a large-scale one, the key sentences can be translated in advance by experienced translators to help determine the translation style and maintain consistency in translation. Consequently, the quality of the translated text will be enhanced. ﹀
分类号：	TP391.2
论文总页数：	69
参考文献总数：	51
馆藏号：	017/M2010(208)
公开日期：	2010-05-28

基于用户分类与行为分析的旅游类智慧型搜索用户模型的研究.孟凡亮

题名：	基于用户分类与行为分析的旅游类智慧型搜索用户模型的研究
姓名：	孟凡亮
学号：	10717230
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	软件与微电子学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-05-28
关键词：	旅游行业；信息检索；个性化；用户模型
论文摘要：	︿由于“信息过载”和“信息迷向”现象的出现，旅游电子商务遇到了新的挑战。本文通过对旅游行业用户需求以及搜索引擎个性化推荐方法及相关技术的研究，设计了基于人群分类和用户行为分析的旅游类智慧型搜索用户模型。本文的主要贡献是：1. 根据旅游行业的特殊性，提出了根据用户的旅游综合目的来区分用户群的方法，根据人群的划分，可以为用户挖掘和推荐潜在的需求和信息，实现了信息的相关性和兴趣关联性的融合；2. 设计了基于用户、长期兴趣、短期兴趣的用户模型，可以持续的跟踪用户兴趣的变化，并根据用户兴趣的变化调整兴趣和网页类型的权值，实现了贴身化设计的出发点；通过实验，我们针对旅游行业进行兴趣点聚类，得到1863个旅游行业类别，为用户长期兴趣的选择提供辅助；我们为旅游行业用户的每一个长期兴趣分配68个关键词容量，用于跟踪记录用户的短期兴趣变迁；通过用户平均置换关键词次数和用户平均满意度的对比评测，我们选择了时钟算法和NFU算法作为关键词置换算法；通过同类产品对比评测，我们基于用户分类和行为分析的方法，可以在消耗时间、信息有用性比例上得到改善。根据用户的兴趣和特点，对信息资源进行收集、整理和分类，向用户提供和推荐符合其兴趣偏好或需求的信息或服务。﹀
分类号：	TP393
论文总页数：	61
参考文献总数：	33
馆藏号：	017/M2010(122)
公开日期：	2010-05-28
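
This abstract mentions a fixed keyword capacity of 68 per long-term interest and NFU (not frequently used) as one of the keyword replacement algorithms. The sketch below shows one simple way such a store could behave. The class and method names are hypothetical, and the counting scheme is a simplified NFU (one counter per keyword, evict the smallest), not the thesis's implementation.

```python
class NFUKeywordStore:
    """Fixed-capacity keyword set that evicts the least frequently used keyword."""

    def __init__(self, capacity=68):
        self.capacity = capacity
        self.counts = {}  # keyword -> number of times it has been seen

    def touch(self, keyword):
        """Record one use of a keyword; return the evicted keyword, if any."""
        if keyword in self.counts:
            self.counts[keyword] += 1
            return None
        evicted = None
        if len(self.counts) >= self.capacity:
            # NFU: drop the keyword with the smallest use count.
            evicted = min(self.counts, key=self.counts.get)
            del self.counts[evicted]
        self.counts[keyword] = 1
        return evicted
```

With a capacity of 2, touching "beach" twice, then "hotel", then "museum" evicts "hotel", because it has the lowest count when the store is full. Plain NFU never forgets old counts, which is why aging variants are often preferred when interests drift over time.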

英语动词性隐喻识别研究.徐建

题名：	英语动词性隐喻识别研究
姓名：	徐建
学号：	10817412
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-05-28
外文题名：	Research on Identifying English Verb Metaphors
关键词：	隐喻识别；依存关系；概念距离；词语聚类；词语相似度
外文关键词：	metaphor identification; dependency relation; conceptual distance; word clustering; word similarity
论文摘要：	︿隐喻句的自动识别和理解是计算语言学领域的一个重要研究课题。其对机器翻译、信息检索、人工智能、问答系统等领域有着重要的影响。本文以中国电子信息产业发展研究院（CCID）的英文语料为基础，从以下几个方面对英语动词性隐喻句进行识别研究：（1）从句子层级出发，利用句法分析工具获取句子的主语、谓语和宾语。本文通过Stanford大学的句法分析工具获取句子的主语、谓语和宾语以及它们之间的依存关系，确定本文研究的重点：以英语动词为中心，探索主语、宾语同谓语动词的搭配规律。（2）采用统计方法计算主语和谓语、谓语和宾语之间的概念距离，依据概念距离的大小，输出候选的隐喻句子。本文利用n-gram和对数似然方法计算主语和谓语、谓语和宾语的概念距离，依据概念距离的大小，发现语料中的隐喻现象；使用数据平滑的方法解决统计模型中的数据稀疏问题。另外，本文研究主语、宾语的同位词、上位词同谓语动词之间的概念距离，提高了隐喻识别的准确率和召回率。（3）利用现有隐喻资源，采用基于WordNet的方法计算词语间的相似度以及利用K-means方法进行词语聚类，扩大了隐喻识别范围。本文使用Master Metaphor List、TroFi Example Base语料作为基础的隐喻资源。通过Stanford大学的句法分析工具获得基础资源中主语和谓语、谓语和宾语之间的依存关系。然后引入K-means方法对中国电子信息产业发展研究院（CCID）的英文语料中的主语、宾语进行词语聚类，获得相应的聚类结果。接下来，计算基础隐喻资源中的主语、宾语同各类别中成员之间的相似度，依据相似度的大小，输出相应的类别作为候选的隐喻表达。本文的研究方法不仅降低了隐喻识别过程中主语和谓语、谓语和宾语之间的噪音，而且相似度聚类算法弥补了统计方法的缺陷。实验结果表明使用本文方法识别隐喻的准确率和召回率高于单纯使用统计方法。﹀
外文摘要：	︿ The automatic identification and interpretation of non-literal language is a major task in computational linguistics. It has an important influence on machine translation (MT), information retrieval (IR), artificial intelligence (AI), question answering (QA) and other domains. Based on the English corpus provided by the China Center for Information Development (henceforth CCID), this paper studies the identification of English verb metaphors from the following aspects. (1) Use the syntactic parser of Stanford University to parse the sentences in the CCID English corpus in order to obtain the subject, predicate and object. The parser is used to obtain the subject, predicate and object of each sentence and the dependencies among them. Taking the predicate as the core of the research, this paper explores the collocation patterns between subject and predicate and between predicate and object. (2) Use statistical approaches to compute the conceptual distance between subject and predicate and between predicate and object, and output candidate non-literal sentences according to a given threshold on the conceptual distance. This paper uses n-gram and log-likelihood approaches to compute these conceptual distances and detects non-literal usage in the corpus according to the threshold; add-one smoothing is employed to deal with data sparseness. Furthermore, the hyponyms and hypernyms of the subject or object are considered when computing the conceptual distances, which improves the precision and recall of metaphor identification. (3) Use existing metaphor resources to compute the similarity between words, and use the K-means clustering technique to improve the precision and recall of metaphor identification. This paper uses the Master Metaphor List and the TroFi Example Base as the base metaphor resources. The syntactic parser of Stanford University is used to parse the sentences in these base resources in order to obtain their subjects, predicates and objects. Then, the K-means technique is used to partition the subjects and objects in the CCID corpus into clusters. Finally, this paper computes the similarity between the words of the base metaphor resources and those of the CCID resources, and outputs candidate non-literal sentences according to a given similarity threshold. The approaches taken in this paper reduce the noise between subject and predicate and between predicate and object during metaphor identification. Meanwhile, the clustering technique and the statistical approach complement each other in identifying verb metaphors. Experimental results show that the precision and recall achieved in this paper are higher than those of purely statistical methods. ﹀
分类号：	H087
论文总页数：	65
参考文献总数：	25
馆藏号：	017/M2010(333)
公开日期：	2010-05-28

2010-05-27

基于语境理论的移动英语辅助.侯晓琛

题名：	基于语境理论的移动英语辅助
姓名：	侯晓琛
学号：	10717138
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	软件与微电子学院
导师2姓名：	俞敬松
论文答辩日期：	2010-05-27
关键词：	语言语境；移动学习；易读性；记忆提醒
论文摘要：	︿篇章语言学认为，任何语篇都在一定语境条件下产生，产生语篇的语境集合中的每一成分都分别影响着语言结构的选择和解释。因此语境理论对英语学习的指导作用不可估量，它有助于对语言的立体认识 (唐祥金 2001) 。通过语境可以使学习者看到词汇的搭配和具体用法，并理解同一词语在不同的句法结构中不同的含义，以及不同的词类出现在句中有着不同的功用。由篇章构成的语境包含的内容把篇章中的词语联系起来，可以使学习者更容易从他们的心理词汇中提取词汇，有利于词汇的记忆 (余德敏 2002年)。所以本文基于语境理论对英语词汇学习的指导意义，通过对英语学习者的词汇学习现状和需求做调查后，设计移动英语辅助学习系统。结合易读性研究，设计适合中国英语学习者的文章筛选模块，为学习需求者提供适合个人水平阅读的英语文章；系统设计在学习者阅读文章时，遇到陌生单词可即刻点击查看释义、同反义词、词汇搭配及相关例句，并为学习者制定科学的词汇记忆策略，通过短信的方式为其复现学习内容并提醒复习，为随时随地辅助英语词汇学习提供便利。本文首先研究语境及移动学习的基本理论，在研究基础上建构基于语境的词汇学习模式。其次，通过调查问卷对英语词汇教学现状进行分析，并得到学习者对移动英语辅助学习系统的需求数据。然后，介绍本项目的具体设计思路，包括文章筛选和复习提醒模块的具体研究设计，及对客户端和服务器端的功能模块和接口的详细介绍。最后，模拟系统设计的部分流程对学习者学习效率进行实地测试，验证系统设计的意义性。﹀
外文摘要：	︿ Text linguistics holds that any discourse is produced in a certain context, and that each component of the context in which a text is produced influences the choice and interpretation of language structures. A word has various connections with other words in terms of sound, form and meaning, so a word cannot be learned in isolation from these connections. In context, learners can understand words better and master them more easily. Only when a word appears in a specific collocation can learners grasp its usage, because the meaning of a word may change in different settings. Learners can also comprehend sentence structure better in context. Based on research into context theories and an investigation of the current state of English vocabulary teaching and the needs of English learners, a system for assisting English learning is designed. It provides suitable articles for learners to read so that they learn words in context, and it offers related collocations and sample sentences containing the target words, which makes it easier for learners to commit words to long-term memory and retrieve them faster. The system is built on a mobile platform, provides a reminder mechanism that re-presents target words for review through SMS, and lays a basis for portable learning anytime and anywhere. ﹀
分类号：	G791/H0
论文总页数：	70
参考文献总数：	24
馆藏号：	017/M2010(075)
公开日期：	2010-05-27

科技翻译项目的风险管理研究初探——以F汽车维修手册文档的翻译项目为例.王水平

题名：	科技翻译项目的风险管理研究初探——以F汽车维修手册文档的翻译项目为例
姓名：	王水平
学号：	10817364
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-05-27
关键词：	科技翻译项目；项目管理；风险管理
外文关键词：	Sci-tech Translation Project; Project Management; Risk Management
论文摘要：	︿风险管理是项目管理的重要领域知识之一，已经广泛应用于金融领域和工程项目中，但是在翻译项目管理中还属于空白。随着科技的迅猛发展和经济全球化的进一步深入，科技翻译已经逐步转向商业化、市场化、产业化。纵观当前翻译产业，商业翻译基本以科技翻译为主，其基本目标就是保证译文质量并按时交付译稿。现在，商业科技翻译的趋势是任务重、时间紧、专业性强，需要多人合作完成并同时达到客户要求的专业水准。这不能简单机械地分工进行翻译然后将译文合并，而需要引入项目管理的理念和方法，将翻译看作是一种项目活动进行管理。良好的风险管理非常有助于项目取得成功，否则项目失败的概率就会大大增加。本文基于风险管理的基础理论方法，结合本文作者在读研期间参与或负责的多个科技翻译项目实践经历，对科技翻译项目实施过程中的风险进行了分析，构建了科技翻译项目的风险管理框架，并以本文作者组织实施的F汽车用户维修手册文档的翻译项目实践经验为例，对风险管理在科技翻译项目中的应用进行了探索性研究。本文正文共分为六章。第一章介绍了本文的选题背景及意义、问题的提出过程、相关研究的现状以及本文的研究方法、内容、目标与组织结构。第二章综述了风险管理的相关理论，并讨论了在翻译项目管理中引入风险管理对翻译项目管理和质量控制的重要意义。第三章比较分析了科技翻译项目的实施流程，提出了作者认为更为合理的基于项目管理知识科技翻译项目实施流程。第四章是本文的重点，详细分析了科技翻译项目实施过程中的风险因素，提出了科技翻译项目的风险管理框架。第五章介绍了F翻译项目及其实施过程，详细论述了风险管理方法在F翻译项目中的应用过程，并通过项目结果验证了风险管理方法在科技翻译项目管理中的积极效果。第六章为总结与展望。﹀
外文摘要：	︿ Risk management, one of the knowledge areas of project management, has already been applied to financial and engineering projects but is not yet employed in the management of translation projects. With the rapid development of science and technology and the deepening of economic globalization, sci-tech translation is gradually becoming commercialized, market-oriented and industrialized. Across the modern translation industry, commercial translation consists mainly of sci-tech translation, with an ever-decreasing share of literary translation. The fundamental objectives of commercial sci-tech translation are assured quality and on-time delivery, while translation projects nowadays tend to involve more work in less time. Such projects can therefore be completed only through cooperation that draws on the theories and methods of project management, rather than by simply dividing the project into parts for translation and mechanically combining the results. With good risk management a project is likely to succeed; without it, the chances of failure increase greatly. Based on the primary theories and methods of risk management, and drawing on the different sci-tech translation projects that the author took part in or organized, this paper analyzes the risks in the execution flow of sci-tech translation projects. It then constructs a risk management framework for sci-tech translation projects and takes the translation of the F auto maintenance manual as an example to explore the application of risk management to sci-tech translation projects. This paper consists of six chapters. The first chapter is an introduction, presenting the research background and significance, the statement of the problem, the current status of related research, and the methods, content, objectives and structure of this paper. The second chapter introduces the basic concepts and theories of risk management and then discusses the importance of risk management in translation project management. The third chapter analyzes and compares the execution flows of sci-tech translation projects and proposes a translation project execution flow based on project management knowledge. The fourth chapter probes into the risks underlying sci-tech translation projects, analyzing and assessing these risks and proposing responses to them. On the basis of this discussion, and combining the sci-tech translation project management process with the theories and methods of risk management, the paper derives a risk management framework for sci-tech translation projects, namely the “One Process with Two Orientations” model. In the fifth chapter, taking the translation project of an auto maintenance manual as an example, the paper illustrates the application and performance of risk management theories and methods in sci-tech translation projects. The last chapter presents the conclusion and outlook. ﹀
分类号：	H059
论文总页数：	80
参考文献总数：	67
馆藏号：	017/M2010(318)
公开日期：	2010-05-27

英文技术文档语域分析及写作项目研究.刘雪姣

题名：	英文技术文档语域分析及写作项目研究
姓名：	刘雪姣
学号：	10817251
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	软件与微电子学院
导师2姓名：	俞敬松
论文答辩日期：	2010-05-27
关键词：	技术文档；系统功能语言学；语域；词汇密度；平均句长；及物性；主位推进模式
论文摘要：	︿英文技术文档在遵循科技英语的一般规律外，同其他科技散文等科技英语文章相比仍有较大差别。在目前的研究中，英文技术文档常被笼统地归结于科技英语，混淆了技术文档与科技散文体的差异性。对技术文档语域特征的分析有助于文档写作中语言的选择、写作流程的制定、对所实现的功能进行有效的理解，并最终对技术文档质量考核提供帮助。本文以IT行业英文技术文档研究对象，以源自语言学家韩礼德（M.A.K. Halliday）系统功能语法理论中的语域理论为基础，旨在探讨语域理论对技术文档写作的制约作用。在对技术文档语域特征进行定性分析同时，本文引入了定量分析，对技术文档与科技散文体，技术文档书面体与技术文档口语体的词汇密度及平均句长进行了对比分析，并且统计分析了英文技术文档的及物性结构和主位推进模式。另外本文在语域特征分析之后，又例举了技术文档写作的实际工程案例，以此来展示在实际工作中，如何使用写作策略来满足英文技术文档的语域对其写作的要求。本文的写作策略的提出并非是Style Guide中内容的重复，而是针对技术文档与其它文体在语域特征上的差异而提出的对策，并且是在实际项目中验证的切实可行的解决方案。本文发现，作为科技英语大语域下的独立小语域，英文技术文档有着不同于科技散文体的语域特征，分别体现在词汇密度、平均句长、时态、语态等方面。另外，关于技术文档及物性研究的结果表明，在英文技术文档中，物质过程占了绝大比例，而关于主位推进模式的研究表明了在英文技术文档中，主位同一型占了主要部分。本文为技术文档写作工程师及文档翻译人员提供参考，促进技术文档写作的研究和发展。﹀
外文摘要：	︿ Technical documentation is a sub-category of English for Science and Technology (EST), and its main purpose is to provide guidance, notices or advice to users in plain English. However, technical documents still differ from other EST texts in many respects. Register analysis helps with planning the writing procedure, word choice and quality assurance, and helps writers better understand what they are about to write. Based on Halliday’s theory of register, this thesis studies the field, tenor and mode of technical documents in the IT field, which influence the lexical and sentence-level features of technical writing. The thesis is, as a whole, a qualitative study supported by quantitative data. Lexical density and average sentence length are compared between technical documents and technical essays, and between written and spoken technical documents. In addition, the transitivity and thematic progression patterns of technical documents are analyzed. After the register analysis, a real technical writing project is introduced. The thesis tries to avoid repeating the contents of the Style Guide and instead puts forward methods aimed at fitting the register features of technical documents. Compared with technical essays, technical documents have different features in lexical density, average sentence length, voice and tense. The study of transitivity finds that material processes occur more frequently than other process types, and the study of thematic progression patterns shows that the thematic consistency pattern accounts for a large proportion. The research findings can serve as a reference for technical writers and translators, and it is hoped that they will be a good starting point for the study of technical writing and contribute to research on technical writing in other fields. ﹀
分类号：	H0
论文总页数：	63
参考文献总数：	24
馆藏号：	017/M2010(291)
公开日期：	2010-05-27

基于单语语料库的翻译实务研究—Sketch Engine工具在翻译实践中的运用.王一一

题名：	基于单语语料库的翻译实务研究—Sketch Engine工具在翻译实践中的运用
姓名：	王一一
学号：	10817376
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-05-27
关键词：	语料库；翻译实践；Sketch Engine
外文关键词：	corpus; translation practice; Sketch Engine
论文摘要：	︿自20世纪90年代以来，Mona Baker，Sara Laviosa，Kiresten Malmkjaer等翻译研究者开始将语料库运用于翻译研究，使基于语料库的研究方法在翻译理论研究领域得以迅速地普及和发展，并带来了翻译研究范式的根本变革。二十年来，诸多学者利用语料库在翻译理论以及翻译教学等方向进行研究，并取得较好效果。但对于使用语料库，特别是使用单语语料库辅助翻译实践的研究却数量甚微。从使用语料库进行翻译理论研究的经验中可以发现，无论是语料库还是语料库检索软件，均可作为计算机辅助翻译工具的一种，应用于翻译实践，辅助提高译文质量。因此，本文将讨论单语语料库运用于翻译实践的理论基础和可行性，借助于Sketch Engine检索工具，通过使用现有单语语料库和自建单语语料库的方式，在具体的翻译实例中讨论单语语料库对于翻译实践的重要作用。本文的研究目的是分析单语语料库运用于翻译实践的可能性，探索单语语料库在翻译实践中的应用。本文以Word Sketch检索为基础，以词汇翻译为切入点，以目标语单语语料库为研究对象，在具体的翻译实例中，考察语料库的实际应用。本文的研究意义是以单语语料库为纽带，将翻译理论研究与翻译实践研究相结合，对目标语单语语料库在翻译实践中的应用做了新的探索。将Sketch Engine工具引入翻译的实践操作中，为基于语料库的翻译实践提供了新的研究方式。﹀
外文摘要：	︿ Since the 1990s, translation scholars such as Mona Baker, Sara Laviosa and Kirsten Malmkjaer have applied corpora to translation studies, which led to the rapid spread and development of corpus-based approaches in translation theory research and brought about a fundamental change of research paradigm in the field. Over the past 20 years, many scholars have used corpora to study translation theory and explore translation pedagogy, with good results. However, little effort has been devoted to corpus-based studies of translation practice, especially those using monolingual corpora. The experience of corpus-based translation theory studies shows that both corpora and corpus query software can be regarded as computer-aided translation tools that can be applied in translation practice to improve translation quality. Therefore, this thesis discusses the theoretical basis and feasibility of using monolingual corpora in translation practice. Using the Sketch Engine tool to query existing monolingual corpora and to build target-language corpora, it analyzes the importance of monolingual corpora in specific translation cases. This thesis aims to analyze the feasibility of using monolingual corpora in translation practice and to explore their application. Based on Word Sketch queries, taking lexical translation as the point of entry and target-language monolingual corpora as the object of study, it investigates the practical application of corpora in specific translation cases. The significance of this thesis is that, with monolingual corpora as the link, it integrates translation theory studies with translation practice, explores anew the use of target-language monolingual corpora in translation practice, and, by introducing Sketch Engine into translation practice, provides a new research method for corpus-based translation practice. ﹀
分类号：	H087
论文总页数：	66
参考文献总数：	48
馆藏号：	017/M2010(322)
公开日期：	2010-05-27

基于系统功能语言学的中英翻译质量评估模式研究.林曦

题名：	基于系统功能语言学的中英翻译质量评估模式研究
姓名：	林曦
学号：	10717196
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-05-27
关键词：	翻译；翻译批评；功能语言学；语篇分析；翻译质量评估
外文关键词：	translation; translation criticism; functional linguistics; textual analysis; translation quality assessment
论文摘要：	︿翻译是各种文化之间联系交流的重要桥梁。而随着国际间交流的日益频繁，翻译质量已经成为翻译领域倍受关注的话题。本文基于系统功能语言学的理论，围绕翻译批评的核心翻译评估来对翻译质量进行研究。首先，在大部分的交际过程中，主要呈现的是语篇形式。以语篇为载体，先从“自下而上”的层面入手，从形式，功能和情景三者相互之间的联系和影响在小句上的反映，从反映情景语境的语域(register)，即语场(field)，语旨(tenor)和语式(mode)对小句的三种意义，即概念意义，人际意义和语篇意义的直接作用，通过比较两个预设定的参数：语义（概念）和语篇（人际），来判断译文对原文是否发生了“偏离”，是哪里发生了“偏离”，此类“偏离”是属于正“偏离”还是负“偏离”。第二步，立足于整个语篇的宏观角度，以语篇类型学和功能语言学为指导，对第一步在微观层面所涉及到的“偏离”个案进行梳理和价值判断；接着，根据第二步所做出的梳理和判断，重新定位和取舍第一步中发现的“偏离”并进行统计，对“偏离”（概念、人际）设定相适宜的权重；最后，根据权重在定量分析后所得出来的“偏离”值，参照加拿大政府翻译局的语言质量衡量系统Sical对译文质量进行合理的评估，来判断在多大程度上与原文是“对等”的，并由此来区分译文的等级。最后，本文采用实例分析，通过理论和实践，总结出中英翻译质量评估模式，完成了一个较为系统的研究。﹀
分类号：	H0/TP311.52
论文总页数：	59
参考文献总数：	41
馆藏号：	017/M2010(101)
公开日期：	2010-05-27

基于SNS的用户协作型英汉翻译词典社区的研究与设计.李博

题名：	基于SNS的用户协作型英汉翻译词典社区的研究与设计
姓名：	李博
学号：	10717168
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师2姓名：	俞敬松
导师2单位：	软件与微电子学院
论文答辩日期：	2010-05-27
关键词：	翻译词典；社交网站；虚拟社区；SNS应用；开放平台
外文关键词：	Translation Dictionary; Social Networking Site; Virtual Community; SNS Application; Open Platform
论文摘要：	︿随着Web2.0时代的到来，网络信息服务更加注重以用户为中心的原则，其中最典型的代表SNS社交网站在国内发展迅猛，受众越来越广，影响越来越大，深入人心。人们的线上生活越来越重要，对线上应用的需求也越来越旺盛。目前， SNS上的各种应用如雨后春笋般出现，并逐步实现系统开发，涉及游戏、交友、知识分享等众多领域，但SNS上的用户协作型英汉翻译词典应用却凤毛麟角，发展空间很大。本文围绕着翻译词典和SNS展开，分为理论研究、需求分析、系统设计与实现三部分。首先，从翻译词典的定义、释义理论与方式、词典例证翻译的等值原则三个方面分别阐述，结合虚拟社区的特点概括出翻译词典社区的组成要素，再对SNS进行简要概述，根据翻译词典社区与SNS的结合点，从技术整合可行性、翻译模式变革性和传播模式优质性三个方面对两者的整合应用进行适应性研究；其次，以英汉两种语言作为尝试，对比分析了英汉翻译词典与英语学习词典、几种纸质翻译词典和几种在线翻译词典的优势与不足，结合翻译实践和翻译社区的组成要素提炼出译者的翻译词典社区需求并进行系统需求分析；接着，根据需求分析提出系统的解决方案，并对系统在功能模块、概念模型、动态流程和数据库等多方面进行了总体设计与详细设计；最后的系统实现，采用PHP、HTML和JavaScript作为开发语言，MySQL作为数据库，通过OpenSocial作为应用程序编程接口来实施，并在Orkut开放平台上进行测试。对系统的初步测试和调查结果表明，本文最终对英汉翻译词典社区的设计与实现能够满足用户使用翻译词典、协作词典编纂、互动翻译和在线测试等功能，希望此应用能够成为英汉翻译爱好者的线上辅助翻译工具之一。﹀
外文摘要：	︿ With the advent of the Web 2.0 era, network information services focus more on user-centered principles. Social networking sites (SNS), the most typical representatives of Web 2.0, have developed rapidly in China, reaching ever more users and gaining ever more influence. People's online lives have become increasingly important, and the demand for online applications has grown accordingly. SNS applications have mushroomed and now cover many areas, such as games, making friends and knowledge sharing. However, collaborative translation dictionary applications on SNS are still rare, which leaves great room for development. This paper focuses on translation dictionaries and SNS and consists of three parts: theoretical research, requirements analysis, and system design and implementation. Firstly, the author conducts a theoretical study of translation dictionaries from three aspects, namely the definition of a translation dictionary, the theory and methods of defining entries, and the equivalence principle for translating dictionary examples, and summarizes the elements of a translation dictionary community by drawing on the characteristics of virtual communities. The author then gives a brief overview of SNS and, based on the points where a translation dictionary community and SNS meet, studies the suitability of integrating the two from three aspects: the feasibility of technical integration, the transformation of the translation model, and the quality of the communication model. Secondly, taking English and Chinese as a trial language pair, the author compares the strengths and weaknesses of English-Chinese translation dictionaries and English learner's dictionaries, of several paper translation dictionaries, and of several online translation dictionaries. The translators' needs for a translation dictionary community are derived by combining translation practice with the elements of the translation community, and the system requirements are analyzed. Thirdly, the author proposes a solution based on the requirements analysis and carries out the overall and detailed design of the function modules, concept models, dynamic process models and database tables. Finally, the SNS-based English-Chinese translation dictionary community is implemented with PHP, HTML and JavaScript as development languages and MySQL as the database; the system is deployed through the OpenSocial API and tested on the Orkut open platform. Preliminary tests and a survey show that the English-Chinese translation dictionary community designed and implemented in this paper meets users’ needs for using a translation dictionary, collaborative dictionary compilation, interactive translation and online testing. It is hoped that it will become one of the online translation aids for English-Chinese translation enthusiasts and learners. ﹀
分类号：	TP393.01/G206.3
论文总页数：	71
参考文献总数：	31
馆藏号：	017/M2010(088)
公开日期：	2010-05-27

2010-05-19

面向概率型词汇知识库建设的名词语言知识获取.王萌

链接

题名：	面向概率型词汇知识库建设的名词语言知识获取
姓名：	王萌
学号：	10648885
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	博士
学位：	理学博士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	黄居仁
导师2单位：	香港理工大学
论文答辩日期：	2010-05-19
外文题名：	Linguistic Knowledge Acquisition of Noun for the Construction of Probabilistic Lexical Knowledge-base
关键词：	词汇获取；名词；数名结构；量名搭配；复合名词短语；概率语法属性
外文关键词：	language knowledge-base; lexical acquisition; noun; numeral-noun construction; classifier noun collocation; noun compound; probabilistic grammatical characteristics
论文摘要：	︿语言知识库是众多自然语言处理系统不可或缺的组成部分，同时也是各种自然语言处理技术赖以实现的基础。随着语料库方法和统计语言模型在自然语言处理领域的广泛运用，大规模语言知识的开发和自动获取成为目前自然语言处理技术的瓶颈问题。语言知识库建设已经成为自然语言处理领域最基本、最重要的应用基础研究之一。北京大学计算语言学研究所在语言知识库的建设方面积累颇丰，在相同的语法理论体系指导下，开发了一系列以汉语为核心的包含词法、句法和语义等信息的语言知识库，总称为“综合型语言知识库”。本文的研究是以综合型语言知识库为基础，围绕异质资源的集成创新这一主题，从资源集成的“广度”和“深度”两个方向展开研究，主要工作包括：第一，从资源集成的“广度”上，主要探索异质数据资源集成的方法，将结构和表现形式各不相同的语言知识库纳入同一个软件平台，建设“综合型语言知识库系统”，在最大程度上挖掘和发挥资源集成的优势，实现信息服务向知识服务的转型，为自然语言处理研究、语言本体研究及语言教学研究提供全方位、多层次的支持。在这一层次上，本文主要侧重于软件系统的功能设计和开发，完成了综合型语言知识库系统主体功能模块的建设。第二，从资源集成的“深度”上，将结构化知识（词典知识）与非结构化知识（语料库）相融合，研究词语语法属性的概率化描述方法，构建新的语言资源《概率型现代汉语常用词汇知识库》，作为集成创新的成果。本文选择名词为切入点，研究从语料中自动获取名词语法属性的方法，内容涉及数词与名词构成的“数名结构”，数词、量词与名词构成的“数量名短语”以及名词与名词构成的“复合名词短语”，并对这三种属性关系进行了详细的句法和语义分析。本文在这一层次上主要侧重于研究方法的探索，其范围涉及到自然语言处理领域多方面的内容，创新点包括：1. 提出了新的统计量“分散度”，用来区分数词与名词组成的“数名”结构是固定搭配还是自由短语。该统计量对于其它问题，如量词的分类等也具有借鉴意义。2. 设计并实现了复杂数量名短语的识别算法，实验结果表明，该方法可以有效地识别这一类存在语义约束的名词短语。本文将该方法应用到大规模的语料库上，从而得到真实的量名搭配分布。3. 基于量名搭配的统计数据，本文首次采用基于信息论和知识的计算模型，定量地分析了量词对名词的语义选择限制。此外，本文提出了基于量词的名词概念描述方法，研究了量词在名词语义分类中的作用。这些计量研究的成果为量词的定性研究和分析提供了补充和佐证。4．针对统计指标不能有效获取低频复合名词短语的问题，本文提出了新的解决方法，将其视作一个分类问题，利用统计指标获取典型的、高频的复合名词短语作为训练数据，来帮助发现低频的复合名词短语，实验结果说明该思路是有效的。 5. 对于汉语复合名词短语的语义解释，本文首次采用动态的策略，提出了“基于动词的释义短语”的方法，对复合名词短语进行语义解释，该方法不仅可以为复合名词短语提供多种可能的语义解释，而且能够反应相似的复合名词短语之间细微的语义差别。综合型语言知识库系统既是本文的研究基础，又是本文的研究目标。作者在资源集成两个层次上的研究工作，不仅为后续工作提供软件支持，也为其它词类的语法属性之计量研究提供方法上的借鉴。﹀
外文摘要：	︿ Language knowledge-bases are an indispensable component of many natural language processing (NLP) systems as well as the foundation for many NLP technologies. The construction of language knowledge-bases has become one of the most fundamental and crucial research fields in NLP. Among the various methods and language models in NLP, corpus-based methods and statistical language models are used extensively. The automatic acquisition of language knowledge, however, has become a challenge for NLP technologies. The current study, based on the Comprehensive Language Knowledge-base (CLKB), discusses language resource integration. CLKB is a collection of language resources accumulated by the Institute of Computational Linguistics at Peking University. It provides knowledge on various linguistic aspects of Chinese, including the lexicon, syntax and semantics. Specifically, this thesis addresses the breadth and the depth of resource integration, and consists of two main parts. First, in terms of the breadth of resource integration, we explore methods of integrating heterogeneous resources with different structures and representations. The goal is to build the CLKB system so that it can accommodate all resources and extend resource integration to the maximum. Eventually, we will enhance the CLKB system so that it serves as a knowledge service system that provides comprehensive, multi-level support to fields such as NLP, theoretical language research and language teaching research. From the perspective of breadth, this part of the work focuses primarily on the functional design and development of the CLKB system. Second, in terms of the depth of resource integration, we combine the structured knowledge from the dictionary and the unstructured knowledge from the corpus to produce a new language resource as the result of resource integration. This part of the study concentrates on the automatic acquisition of the grammatical characteristics of nouns from corpora. We conduct detailed syntactic and semantic analysis of the following linguistic constituents: “Numeral-Noun” constructions, classifier noun phrases and noun compounds. From the perspective of depth, this part of the work involves a wider range of NLP tasks and emphasizes the exploration of methodology. It consists of the following innovations: 1. We propose a new statistical score, “Distribution Degree”, which can be used to distinguish between fixed collocations and free phrases formed by “Numeral-Noun”. This statistical score can also serve as a reference for other similar problems, such as the classification of Chinese classifiers. 2. We implement an algorithm that automatically identifies classifier noun phrases in a corpus. Our experimental results show that it is effective in recognizing such noun phrases with semantic restrictions. Moreover, this algorithm has been applied to a large-scale corpus to acquire the real distribution of classifier-noun collocations. 3. Based on the statistical distribution of classifier-noun collocations, we conduct, for the first time in the relevant research literature, a quantitative analysis of the selectional preference between classifiers and nouns using an information-theoretic model. In addition, we propose a representation of noun concepts using classifiers, and analyze the contribution of classifiers to the classification of nouns. The results of this quantitative research provide useful supplements to and justification for qualitative research on Chinese classifiers. 4. We propose a new method for acquiring infrequent Chinese noun compounds, since many statistical scores cannot acquire infrequent Chinese noun compounds reliably. In this new method, acquisition is modeled as a two-class classification problem: statistical scores are used to obtain typical, frequent noun compounds as training data, and a classifier is then trained to discover the infrequent ones. The experimental results show that this strategy is applicable. 5. We present, for the first time in the literature, a dynamic method that uses verbal paraphrases to interpret the meaning of Chinese noun compounds. This method not only provides the possible interpretations of a noun compound, but also reflects the subtle semantic differences between similar noun compounds. The CLKB system is both the basis and the goal of the current research. From the perspective of the breadth and depth of resource integration, the work accomplished in this thesis not only provides software support for future research, but also sheds light on the research methodology for quantitative studies of other word classes. ﹀
分类号：	H087
论文总页数：	119
参考文献总数：	146
馆藏号：	048/D2010(62)
公开日期：	2010-05-19

2009-06-08

英汉中动结构推导的句法语义界面研究.王珺

链接

题名：	英汉中动结构推导的句法语义界面研究
姓名：	王珺
学号：	10639026
论文语种：	chi
专业：	英语语言文学
公开时间：	公开
培养层次：	硕士
学位：	文学硕士
培养单位：	北京大学
院系：	外国语学院
导师1姓名：	何卫
导师1单位：	外国语学院
论文答辩日期：	2009-06-08
外文题名：	Derivation of English and Chinese Middle Constructions—On the Syntax-semantic Interface
关键词：	中动结构；论元结构；句法语义界面；轻动词
外文关键词：	middle construction; argument structure; syntax-semantic interface; light verb
论文摘要：	︿中动结构(Middle Constructions)虽然具有非常简单的“NP＋VP＋Adv”的表层结构，但是在句法领域却吸引了广泛的研究。在二十世纪九十年代之前，生成语法学界中对中动结构的研究主要分为两类：词汇生成和句法生成。随着最简方案（MP）的引进，研究者基于轻动词理论，提出了一系列改进的模型。本文旨在解决有关中动推导的三个核心难题：(1) 中动词论元结构和一般动词不同的机制和原因是什么？(2) 为什么在中动结构中修饰语是必须的？(3) 中动结构形成的语义制约是什么？通过重新考虑中动句NP的句法和语义角色以及NP与谓词之间的关系，本文认为，中动句的NP并非生成在[Comp,VP]位置，而是在一个话语中确定的谈论对象，其与谓词的关系应当是“关于”(aboutness)。另一方面，中动词经过一个“控制－提升”的转换，其论元指派能力丧失。在此基础上我们给出了中动结构的句法推导图。由于句法推导模型仍然无法避免生成不合语法的中动句，本文还探讨了中动结构生成中的语义制约。我们认为NP必须是原来动词论元结构中的某个论元，并且构拟了论元进入中动结构的层阶。此外，虽然中动词不同程度失去了原先的语义内容，但是在部分中动结构的语义解释中仍然起到一定的作用，我们提出NP和VP之间的语义关系是完成中动结构语义解读的一个重要部分。﹀
外文摘要：	︿ The middle construction (MC), despite its simple ‘NP+VP+Adv’ surface structure, has been richly studied in the field of syntax. Prior to the 1990s, such research within the generative tradition fell into two categories: lexical generation and syntactic generation. With the introduction of the Minimalist Program (MP), researchers put forward several revised models, mainly embracing the light verb theory. This thesis aims at addressing three core problems of MC derivation: (1) How and why does the middle verb carry a different argument structure from an ordinary verb? (2) Why is the adverbial modifier obligatory in an MC? (3) What are the semantic constraints on the formation of an MC? On the basis of a re-evaluation of the NP’s syntactic and semantic role, as well as its relation to the predicate of an MC, this thesis argues that the NP is not base-generated at the [Comp,VP] position. Instead, the NP is an established target in discourse to be ‘talked about’, with its relation to the predicate being ‘aboutness’. On the other hand, the middle verb undergoes a ‘control-raising’ transformation in which the verb is deprived of its thematic role assigning ability. It is on this ground that the tree diagram of the MC derivation is constructed. As the syntactic derivation model does not exclude ungrammatical MCs, one chapter is dedicated to a discussion of the semantic constraints on MC generation. We suggest that the NP must be a certain argument within the argument structure of the verb before transformation, and we propose a hierarchy of which arguments are more favored in MC generation. Besides, although the middle verb loses semantic content to a certain degree, it still plays a role in the semantic interpretation of some MCs. We suggest that the semantic relation between NP and VP is an essential part of the interpretation of an MC. ﹀
分类号：	H314/H14
论文总页数：	78
参考文献总数：	66
馆藏号：	039/M2009(21)
公开日期：	2009-06-08

基于相似文本检测的反恶意文本系统.罗侃

链接

题名：	基于相似文本检测的反恶意文本系统
姓名：	罗侃
学号：	10617292
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2009-06-08
关键词：	相邻文本检测；All-Pairs算法；Locality Sensitive Hash；反恶意文本
外文关键词：	Near duplicate detection; All-Pairs algorithms; Locality Sensitive Hash; Anti-spam
论文摘要：	︿随着互联网的飞速发展，互联网上的文本数量也以几何级数的速度膨胀。随着越来越多的用户习惯于使用甚至依赖互联网上的文本信息，例如电子邮件、论坛、个人空间等，大量的恶意文本开始在互联网上广泛地传播。有统计表明，恶意文本传播的规模近年来非但没有被抑制，反而有愈演愈烈的趋势。由于恶意文本的制造者会改变文本的表达方式（例如在两字之间加入空格或符号，或是竖排的书写方式等），传统的基于规则过滤的方法以及机器学习的方法往往不能非常有效的遏制恶意文本的传播。针对绝大多数恶意文本是通过程序自动的、大批量的、文本内容极其近似的方式发布至论坛以及个人空间等领域，本文提出使用基于相似文本检测技术的反恶意文本系统。该系统在离线的状态下从海量的文本中找出内容相近的文本。通过人工的方式标注这些文本并作为训练数据。通过这些训练数据，系统将得到恶意文本自动识别的分类器。经过实验证明，通过该方式标注出的恶意文本所训练出的分类器能够有效的找出海量文本中的恶意文本。和随机抽取文本并标注得到训练数据的方式相比，该方式具有如下两个优点：1）能有效的减少人工标注的工作量；2）能找出新的恶意文本的形式。此外，通过在线的近似文本查询算法，系统可以检测出大量发布近似文本的行为，继而能够有效的遏制恶意的文本的大规模传播。﹀
外文摘要：	︿ As the Internet grows rapidly, the number of digital documents generated and delivered to users is increasing exponentially. Many users have become accustomed to relying on information from Internet media such as email, forums and personal spaces. However, many spam documents are also delivered to users through these media, a trend that has been confirmed by research reports. Because spammers often change the way their text is expressed, rule-based methods and machine-learning-based methods cannot stop the spread of spam effectively. In email spam and similar scenarios, most spam documents are delivered automatically in huge numbers, and their contents are near-duplicates or even identical. By employing near-duplicate document detection, we design an anti-spam system. The system first finds near-duplicate documents in a huge collection of texts in offline mode. These documents are then labeled manually and used as training data to build a spam classifier with machine learning. Experimental results show that this method finds spam with much less human effort, and that the resulting classifier detects spam documents effectively. Furthermore, the method can discover new forms of spam. In addition, with the online near-duplicate detection algorithm, the system can block the spread of spam efficiently. ﹀
分类号：	TP393.08
论文总页数：	70
参考文献总数：	49
馆藏号：	017/M2009(129)
公开日期：	2009-06-08
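
The near-duplicate idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis's All-Pairs or Locality Sensitive Hash implementation: it compares every pair of documents by the Jaccard similarity of their character 3-gram sets, after dropping whitespace so that text padded with spaces still matches. The function names, the shingle length `k=3` and the threshold `0.8` are illustrative assumptions.

```python
def shingles(text, k=3):
    """Return the set of character k-grams of text, ignoring whitespace."""
    cleaned = "".join(ch for ch in text if not ch.isspace())
    if len(cleaned) < k:
        return {cleaned} if cleaned else set()
    return {cleaned[i:i + k] for i in range(len(cleaned) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicate_pairs(docs, threshold=0.8):
    """Brute-force comparison of all pairs (assumed threshold, O(n^2) pairs)."""
    sets = [shingles(d) for d in docs]
    return [
        (i, j)
        for i in range(len(docs))
        for j in range(i + 1, len(docs))
        if jaccard(sets[i], sets[j]) >= threshold
    ]


docs = [
    "buy cheap watches now",
    "b u y cheap watches now!",   # same message with inserted spaces
    "meeting moved to friday",
]
print(near_duplicate_pairs(docs))
```

A brute-force pass like this only scales to small collections; techniques such as All-Pairs or LSH, named in the keywords, exist to avoid comparing every pair.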

基于条件随机场的中文地名和机构名识别.谭大伟

链接

题名：	基于条件随机场的中文地名和机构名识别
姓名：	谭大伟
学号：	10617339
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	软件与微电子学院
论文答辩日期：	2009-06-08
论文摘要：	︿命名实体识别属于自然语言处理的基础研究领域，它是信息抽取、信息检索、机器翻译、组块分析、深层句法分析、语义分析、问答系统等多种自然语言处理技术的重要基础。因此，命名实体识别研究对自然语言处理和文本信息处理具有极其重要的意义。本文针对现代汉语文本的特点，主要研究以地名和机构名识别为核心内容的中文命名实体识别问题，以条件随机场为基本框架，设计并实现了一个中文地名和机构名识别系统。本文的大体框架如下：首先，介绍了命名实体识别的研究背景、基本概念和识别方法，重点介绍了中文命名实体识别的特点与研究现状。然后，详细介绍了最大熵、支持向量机、条件随机场三种统计模型，并从理论依据和实验结果出发比较了它们用于中文命名实体识别的优劣。接着，在使用条件随机场模型的基础上比较了基于分词和不基于分词、逐一识别和统一识别的识别策略对中文命名实体识别效果的影响，并通过实验选取了有效特征集合。最后，采用基于分词的统一识别策略，实现了一个基于条件随机场模型的中文地名和机构名识别系统，测试结果表明该系统能够获得较为满意的识别效果和性能。﹀
分类号：	TP391/H087
论文总页数：	75
参考文献总数：	0
馆藏号：	017/M2009(152)
公开日期：	2009-06-08

基于动态条件随机场的中文命名实体识别.张友书

链接

题名：	基于动态条件随机场的中文命名实体识别
姓名：	张友书
学号：	10617471
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	北京大学软件与微电子学院
论文答辩日期：	2009-06-08
外文题名：	Dynamic conditional random fields based Chinese named entity recognition
关键词：	动态条件随机场；中文命名实体识别
外文关键词：	Dynamic conditional random fields; Chinese named entity recognition
论文摘要：	︿命名实体识别就是把文本中出现的命名实体包括人名、地名、组织机构名、日期、时间、和其他实体识别出来并加以归类。命名实体识别属于自然语言处理的基础研究领域，是信息抽取、信息检索、机器翻译、组块分析、问答系统等多种自然语言处理技术的重要基础，对命名实体识别进行研究具有很大的实用意义。本文主要研究以人名、地名和组织名识别为主的中文命名实体识别问题。具体说来，本文的主要内容如下：本文首先介绍了命名实体识别的定义及其特点，并简要介绍了国内外的中文命名实体识别研究情况，以及现有的主要命名实体识别方法。目前，中文命名实体识别主要采用统计机器学习的方法，而其中条件随机场的效果最好。所以紧接着详细介绍了链式和动态条件随机场的定义、模型表示、参数估计和训练方法等。进一步地，将链式和动态条件随机场模型应用于中文命名实体识别任务，提出了三种基于条件随机场的命名实体识别方法：使用链式条件随机场基于字的方法、使用链式条件随机场基于词的方法、基于动态条件随机场的方法。最后，通过实验对这三种方法进行了比较分析。本文的主要贡献是：第一，首次尝试将动态条件随机场模型应用到中文命名实体识别中来。第二，利用动态条件随机场进行命名实体识别，将中文分词与命名实体识别过程融合在一起，使二者相互影响，能够同时改善中文分词和命名实体识别效果，改进了现有的命名实体识别技术。第三，基于动态条件随机场实现了一个命名实体识别系统。﹀
外文摘要：	︿ Named entity recognition (NER) identifies and classifies the named entities in a text, including person, organization and location names, times, dates, and other entities. NER is one of the essential tasks of natural language processing research and a key step in many natural language processing tasks, such as information extraction, information retrieval, machine translation, and question answering. In this paper, we concentrate on recognizing person names, location names and organization names in Chinese text. The main contents of this paper are as follows. First of all, we introduce the definition of named entity recognition and its characteristics. Then we give a brief overview of the major named entity recognition methods. At present, Chinese named entity recognition (CNER) mainly uses statistical machine learning methods, among which those based on conditional random fields (CRFs) perform best. Therefore, we introduce the theory of linear-chain and dynamic conditional random fields. Furthermore, we present three methods: a character-based CNER method using CRFs, a word-based CNER method using CRFs, and a CNER method using dynamic conditional random fields (DCRFs). Finally, we analyze and compare the three methods through experiments. Our contributions are as follows. Firstly, we apply DCRF models to the CNER problem for the first time. Secondly, our CNER method using DCRFs can alleviate the influence of errors from the preceding Chinese word segmentation step, and improves the performance of Chinese word segmentation and CNER at the same time. ﹀
分类号：	TP391.1
论文总页数：	53
参考文献总数：	55
馆藏号：	017/M2009(221)
公开日期：	2009-06-08

SVM和CRF结合下人的头部检测与精确定位.吴桐

链接

题名：	SVM和CRF结合下人的头部检测与精确定位
姓名：	吴桐
学号：	10617392
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2009-06-08
外文题名：	Human head detection and exact localization with SVM and CRF combination
关键词：	图像识别；机器学习；SVM；CRF；SIFT；SIFT_LIKE；级联分类器；人的头部定位算法；再验证规则
外文关键词：	Image Detection; Machine Learning; SVM; CRF; SIFT; SIFT_LIKE; Cascade Classification Method; Human Head Localization Algorithm; Post Verification Rules
论文摘要：	︿人的头部检测与定位技术在监控、跟踪系统，智能机器人、智能车载系统的研发中有很大的作用。本文主要研究单目镜拍摄下，静止图片中人的头部检测与精确定位的问题。本文以提高头部检测与定位的准确率和召回率、缩短检测定位时间为研究目标，在特征抽取、分类架构、头部定位算法、再验证规则、系统效率方面提出了一些新的思路和算法。在特征方面，本文提出了一个新的特征，叫“SIFT_LIKE”特征，它与SIFT相比，在图片分割方式、梯度信息计算等方面有所不同。在分类架构方面，本文提出了SVM和CRF级联的分类框架，并结合相应的特征来训练SVM、CRF分类器。在头部定位算法方面，本文采用了一种基于灰度变化的头部定位算法来实现头部的定位，同时通过制定一些规则进行后期再验证，来舍弃那些不符合规则但没有被分类器正确归类的非头部区域，以达到弥补分类器不足，提高检测与定位的准确率和召回率的目的。在效率方面，采用以下几点来缩短检测和定位的时间：（1）搜索策略方面：设定相关参数，控制搜索次数；（2）保存必要的中间计算结果，以减少重复计算；（3）尽量减少IO操作，充分利用内存资源。本文开发了一个人的头部检测与定位系统，在准确率、召回率和检测速度上取得了一定的效果。﹀
外文摘要：	︿ Human head detection and localization plays an important role in the development of surveillance and tracking systems, intelligent robots, and intelligent driver assistance systems. This paper studies human head detection and precise localization in monocular still images. It aims to improve the accuracy rate, recall rate and detection speed, and puts forward some new ideas and algorithms in the following aspects: feature extraction, classification framework, human head localization algorithm, post-verification rules, and system efficiency. This paper proposes a new feature called “SIFT_LIKE”, which is based on SIFT but differs in its block segmentation method, its calculation of pixel gradient information, and some other aspects. For the classification framework, this paper proposes a two-layer cascade method in which the classifiers are SVM and CRF. I try to pick features that strengthen the specific advantages of each classifier. The localization algorithm in this paper is based on grey-level differences between the image’s columns (rows). I then use some rules for post-verification to discard regions that violate the rules but were detected as head regions by the classifiers, which improves the accuracy rate and recall rate of human head detection and localization. I adopt the following methods to speed up detection and localization. The first is using parameters to control the number of searches. The second is saving necessary intermediate results to reduce repeated computation. The third is making full use of memory to reduce I/O operations. I implement a human head detection and localization system, which performs well in terms of accuracy rate, recall rate and time cost. ﹀
分类号：	TP391.41
论文总页数：	42
参考文献总数：	25
馆藏号：	017/M2009(183)
公开日期：	2009-06-08

2009-06-06

汉语的自动词义区分研究.朱虹

链接

题名：	汉语的自动词义区分研究
姓名：	朱虹
学号：	10448896
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	博士
学位：	理学博士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2009-06-06
外文题名：	Research into Automatic Word Sense Discrimination on Chinese
关键词：	词义区分；词义知识表示；词义知识库建设；搭配聚类
外文关键词：	word sense discrimination; word sense representation; lexical knowledge base construction; collocation clustering
论文摘要：	︿自然语言的语义分析是实现自然语言理解的必要手段，其中面向信息处理用的词义分析一直是自然语言处理的焦点和难点。传统语言学的词义研究主要关注词义的发展和演变。汉语词典关于词语的定义又多是描述解释性的，很难反映词语在真实语料中的词义情况，表现在词语定义粒度过细，缺少新义或者特定领域的词义，存在循环定义现象等等，使得现有的词典无法很好地应用于自然语言处理，成为词义消歧、词汇语义知识库建设等研究的瓶颈。因此，面向信息处理的自动词义区分成为了解决词义知识获取问题的重要研究课题。词义区分可以应用于词义知识库构建、词义消歧、信息检索、机器翻译等不同领域。自动词义区分是通过对真实文本的处理，区分和表示词语词义的过程。自动的词义区分基于著名的分布假设，即词语的词义可以通过词语周围环境获知，利用完全无指导的机器学习方法，自动地从文本中区分出词语，特别是多义词的词义内容，确定词语有多少词义，以及将各个词义以某种形式表征出来。它与词义消歧的不同之处在于，它没有预先定义好的词义列表以及词义的个数。词义区分从1998年被正式提出至今，主要研究集中在英语和一些欧洲语言上，汉语方面的研究很少，应用方面还是空白。并且目前还没有一个同时包含词义区分方法和词义知识表示的完整论述。针对这样的研究现状，本项研究面向中文信息处理，对汉语的词义区分的理论和方法进行了完整的研究和探讨，取得如下主要的包含创造性的研究成果： (1) 作为目前首个关于词义区分方法和词义知识表示的完整论述，本文重新定义了“词义区分”概念，总结并归纳三种词义表示方法以及三种对应的词义区分方法，即基于词聚类的词义区分、基于上下文分组的词义区分和基于搭配的词义区分。(2) 研究并设计实现了汉语的基于词聚类的词义区分算法和基于搭配的词义区分算法，弥补了这方面研究的空缺。针对词义区分评价难的问题，对不同的词义区分算法设计了多方面、多层次的评价方法。例如在基于词聚类的词义区分研究中，提出分别从词聚类和词义区分两方面对结果进行自动评价。在比较不同方法的结果时，除了需要关注对应率、覆盖率等具体指标外，还需要关注不同方法结果的内容交叉情况；在基于搭配的词义区分研究中，提出通过人工相关性评价方法更好地完成评价工作；在词义知识库构建的具体应用中，在缺乏标准答案的情况下，提出利用词义个数分布曲线和词义优选序列来评价最后的结果。这些方法都能较为客观地反映词义区分的实际效果，很好地为词义区分研究服务。 (3) 目前汉语词义区分的研究都集中在名词和动词上，还没有形容词方面的相关研究。本文特别针对汉语形容词提出了新的词义区分方法。特别是在基于词聚类的词义区分研究中，选择了易于获取并能体现汉语形容词语义信息的知识，初始化EM聚类算法的参数以提高其性能。通过引入HowNet进一步优化了词形特征的选择，使实验结果得到了进一步的提升。(4) 针对现有搭配词典的词义划分标准不明、典型搭配不典型、数量少、更新慢等问题，本文将搭配研究和词义区分研究有机结合，利用词语的搭配特征区分词语的词义，同时获取可区分词义的搭配知识。并且本文还提出了新的搭配描述框架。该方法的人工评测结果表明，自动获取的搭配具有明显的词义区分能力，可以为构建大规模搭配知识库奠定基础。(5) 设计并实现了词义区分在双语词汇语义知识库CCD建设中的应用。针对CCD词义定义不确切的问题，使用基于词聚类的词义区分方法实现汉语名词和形容词的词义区分，然后通过词集之间的相互映射，修改CCD现有的词语定义。本文还优化了CCD中形容词概念相似度的计算方法，更好地满足了应用需要。通过评价，实验结果符合汉语的实际情况，并且与人工专家的修改意见基本一致。作者通过在汉语词义区分领域中理论、技术、应用等多方面的研究与实践，为汉语的词义区分研究开拓了新的技术和方法，也为其他语言的词义区分研究提供了研究和应用上的新思路。﹀
外文摘要：	︿ Word sense analysis has always been a key issue for both lexicography and lexical semantic processing. Traditional lexicography mainly relies on a theoretical approach: lexicographers try to sum up all senses of a given word using their expertise, and then confirm them by looking for practical usages in a corpus. But more and more research indicates that such induction of word senses is not suitable for Natural Language Processing tasks such as word sense disambiguation and automatic corpus annotation. Alternative models are needed to relate the usages of a word to the senses a dictionary provides for it. Word sense discrimination, also known as word sense induction or word sense discovery, denotes any empirical approach to discovering word senses from real text under the distributional hypothesis, namely that the senses of a word can be discovered from the words surrounding it in a corpus. It is an unsupervised method that requires neither a predefined list of senses from a dictionary nor human intervention. Word sense discrimination is now used in lexicography, word sense disambiguation, information retrieval, etc. There has been some research on word sense discrimination for English and other languages since 1998, but little work on Chinese. This dissertation therefore aims to give a complete discussion of word sense discrimination, from word sense representation theory to discrimination technologies and applications. The innovations of this dissertation are as follows: (1) It clarifies the concept of word sense discrimination, sums up three word sense representation forms, and proposes three corresponding word sense discrimination methods: word-clustering-based, context-clustering-based and collocation-clustering-based word sense discrimination. (2) It proposes new word-clustering-based and context-clustering-based word sense discrimination methods. It also designs different evaluation methods, including automatic and manual ones. (3) It focuses on sense discrimination for Chinese adjectives. In the word-clustering-based method, it exploits Chinese character features, contextual bag-of-words features and host-attribute pairs, instead of the less accessible syntactic information used previously, to initialize the parameters of the EM algorithm, and further optimizes the selection of morphological features by utilizing HowNet. (4) Faced with the problems of existing collocation resources and dictionaries, we propose a new collocation-clustering-based word sense discrimination method to automatically discover collocation clusters that can be used to discriminate the senses of target words. (5) It applies word sense discrimination to the problem of redundant word senses within a lexicon, a common phenomenon that has seriously affected the quality and availability of most WordNets under construction. It combines unsupervised word sense discrimination with automatic mapping measures to eliminate such cases in the lexicon. It also optimizes the similarity measure of CCD and puts forward two novel strategies to automatically evaluate the model’s performance. The experimental results conform well to work based on human judgments. ﹀
分类号：	TP391
论文总页数：	131
参考文献总数：	119
馆藏号：	048/D2009(33)
公开日期：	2009-06-06

词义消歧若干关键技术研究.金澎

链接

题名：	词义消歧若干关键技术研究
姓名：	金澎
学号：	10548879
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	博士
学位：	理学博士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2009-06-06
外文题名：	Researches on Some Key Issues of Word Sense Disambiguation
关键词：	词义消歧；核方法；词相似；词聚类；主动学习；义项分布
外文关键词：	Word Sense Disambiguation; Kernel based Method; Word Similarity; Word Clustering; Active Learning; Sense Distribution
论文摘要：	︿词义消歧是计算语言学领域的基础研究课题之一，长期以来在机器翻译中扮演重要角色。在Internet迅速扩张的今天，词义消歧也是提高信息检索性能的重要途径。本文针对有指导词义消歧中的数据稀疏问题和无指导词义消歧中的义项分布估计问题进行研究，主要工作如下：（1）基于词相似缓解数据稀疏。本文首次将词语相似度（Word Similarity）有效地集成到核方法这一被普遍采用的有指导词义消歧方法中。首先在ACL SIGLEX组织的SemEval2007和SENSEVAL2两次国际评测的英语采样词任务上验证了该方法的有效性。进一步在LDC （Linguistic Data Consortium）的Chinese Gigaword语料上，完成汉语词相似度计算，并验证该方法在SemEval2007评测的汉语采样词任务上的有效性。（2）基于词聚类缓解数据稀疏。将词聚类和基于决策表的搭配消歧相结合。目前几乎所有的高质量词义标注语料库都是人工建造的，该方法旨在减轻词义标注语料库建设中的人工标注工作量。基于决策表的搭配消歧具有高准确率的优点和低召回率的缺点。词聚类的结果用来扩展决策表，实验结果表明这种方法在几乎不损失准确率的前提下，召回率提高了20个百分点。（3）扩大词义标注语料库规模缓解数据稀疏。改变传统的根据多义词在语料库中的出现顺序，依次提交给标注员标注的做法，本文通过主动学习（Active Learning）让系统挑选出那些信息量大的待标注句子优先提给标注人员。在投入相同人工标注工作量的前提下，根据后者提供的标注语料训练得到的分类器性能更优。本文首先验证主动学习在汉语WSD中的有效性，并根据WSD特点提出一种基于特征增加的度量样本信息量的方法。结合该方法和边界采样方法，改善了主动学习的效果。（4）自动估计多义词各义项在语料库中的分布。词义的分布通常是不平衡的，通过无指导的方法估计义项分布可以改善有指导的WSD，也可以提示WSD系统根据当前具体的上下文进行消歧，抑或直接标注最常用义项（Most Frequent Sense, MFS）。在Senseval2英语所有词任务和Semcor1.6数据上进行实验。结果表明在自动估计义项分布越不平衡的多义词上，直接标注 MFS的准确率越高。本文的研究对如何将统计模型和语言学知识有机结合做了有益的探索。这对构建高性能的词义消歧系统有直接的指导意义，也为建设大规模词义标注语料库提供了高效率的方法。本文的部分研究成果对计算语言学习领域的其他任务，如语义角色标注、隐喻识别等也将有借鉴意义。﹀
外文摘要：	︿ Word Sense Disambiguation (WSD) is one of the fundamental open problems in Computational Linguistics. It has supported Machine Translation since the 1950s and now benefits Web search engines. This paper focuses on alleviating data sparseness for supervised WSD methods; unsupervised WSD is also investigated. The main points are as follows: (1) A similarity-based method to alleviate data sparseness. We propose a novel approach to improve kernel-based WSD. We first explain why linear kernels are more suitable for WSD and many other natural language processing problems than translation-invariant kernels. Based on the linear kernel, a distributional similarity thesaurus is used to alleviate data sparseness by generalizing crucial features when they do not match the word form exactly. The experiments show that we outperform the state-of-the-art system on the benchmark data from the English lexical sample task of SemEval-2007, and the improvement is statistically significant. Furthermore, the experiments on the SENSEVAL-2 English lexical sample task and the Multilingual Chinese-English Lexical Sample Task of SemEval-2007 also show the power of this method. (2) A class-based method to alleviate data sparseness. The main disadvantage of collocation-based word sense disambiguation is that recall is low, although precision is relatively high. How can recall be improved without decreasing precision? We investigate a word-class approach to extend the collocation list constructed from the manually sense-tagged corpus, where the word classes are obtained from a larger corpus that is not sense-tagged. The experimental results show that recall improves by twenty percentage points while precision decreases only slightly. (3) Active learning to alleviate data sparseness. The goal of active learning is to obtain better performance than random sampling with the same amount of labeled data. The experiments on Chinese WSD show the effectiveness of min-margin sampling, and a new feature-based sampling method is then proposed. Combining margin-based sampling and feature-based sampling improves the results of active learning. (4) Automatic estimation of the sense distribution. Word sense distributions are usually skewed. Predicting the extent of the skew can help a WSD system determine whether to consider evidence from the local context or apply the simple yet effective heuristic of using the first (most frequent) sense. We propose a method to estimate the entropy of a sense distribution to boost the precision of a first sense heuristic by restricting its application to words with lower entropy. We show on two standard datasets that automatic prediction of entropy can increase the performance of an automatic first sense heuristic. In this paper, we investigate how to integrate linguistic knowledge into statistical models. Not only are WSD systems improved, but word sense annotation also benefits. We believe that some of this research will help other tasks in natural language processing, such as semantic role labeling and metaphor recognition. ﹀
分类号：	TP391.1
论文总页数：	127
参考文献总数：	117
馆藏号：	048/D2009(45)
公开日期：	2009-06-06
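
Point (4) of the abstract above restricts the first-sense heuristic to words whose sense distribution has low entropy. A minimal sketch of that gating step follows. It assumes the per-sense counts are already available (the thesis estimates the distribution automatically, which is not reproduced here), and the function names and the `0.5`-bit threshold are illustrative assumptions rather than values from the thesis.

```python
import math


def sense_entropy(counts):
    """Shannon entropy (in bits) of a sense distribution given raw counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)


def use_first_sense(counts, threshold=0.5):
    """Apply the most-frequent-sense heuristic only when the distribution
    is skewed enough, i.e. its entropy is at or below an assumed threshold."""
    return sense_entropy(counts) <= threshold


# A heavily skewed word: tagging every occurrence with the top sense is safe.
print(use_first_sense([95, 5]))
# A balanced word: the local context should decide instead.
print(use_first_sense([50, 50]))
```

Lower entropy means one sense dominates, which is the case in which the abstract reports that labeling the most frequent sense directly is most accurate.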

2009-06-05

皮钦语，克里奥尔语与二语习得－－－从皮钦英语到中国英语.邹强

链接

题名：	皮钦语，克里奥尔语与二语习得－－－从皮钦英语到中国英语
姓名：	邹强
学号：	A0739004
专业：	英语语言文学
公开时间：	公开
培养层次：	硕士
学位：	文学硕士
培养单位：	北京大学
院系：	外国语学院
导师1姓名：	何卫
导师1单位：	外国语学院
论文答辩日期：	2009-06-05
外文题名：	Pidgins, Creoles and Second Language Acquisition
关键词：	皮钦语；克里奥尔语；二语习得；中介语；中国洋泾浜英语；中国式英语；中国英语
外文关键词：	Pidgins and Creoles; Second Language Acquisition; Interlanguage; Chinese Pidgin English; Chinglish; China English
论文摘要：	︿本论文从澄清概念、描述特征的角度出发分别描述了世界已经死亡的和依然存在的皮钦语，克里奥尔语的特征以及他们与二语习得过程相似性的特点，并且延伸到中国洋泾浜英语，中国式英语和中国英语具体概念的界定和区分，最后提出有益于二语习得及英语学习的建议。本论文主要是在前人的研究基础之上运用比较的方法从理论上去探讨皮钦语，克里奥尔语以及二语习得各自所拥有的词汇和语法特征，最后又讨论了中国英语所拥有的特征及其与中国式英语和洋泾浜英语之间的区分。本文引用和列举了大量有关这些语言特征的研究例证，并进行了有效的逻辑和对比分析。﹀
外文摘要：	︿ This thesis tries to clarify definitions of Pidgins and Creoles, and describe the linguistic features of Pidgins, Creoles and second language acquisition. Following that, the linguistic features of Chinese Pidgin English, Chinglish and China English will be adduced to reinforce the point, from the relevant literature at two levels of the lexicon and the grammar, that they are very different. The previous research by predecessors will be discussed in light of systematic comparison and logical analysis. Our findings suggest that China English has the property of openness, and it can absorb many useful words or usages from Chinese Pidgin English, even Chinglish and other languages and expand into a full-fledged language. The remarkable features of Chinese Pidgin English can help learn English in China. ﹀
分类号：	H
论文总页数：	123
参考文献总数：	199
馆藏号：	039/M2009(95)
公开日期：	2009-06-05

2009-06-01

基于Web的汉语巨型辅助教学系统的设计与实现.王更生

链接

题名：	基于Web的汉语巨型辅助教学系统的设计与实现
姓名：	王更生
学号：	A0617249
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	软件与微电子学院
论文答辩日期：	2009-06-01
关键词：	句型；语言教学；CALL；语言知识库；自然语言处理；交互式学习；句子相似度；编辑距离算法
论文摘要：	︿本文把《现代汉语句法树库》和《结构化句型数据库》这两个语言知识库引入汉语句型教学，并结合自然语言处理技术、JavaScript技术、文档对象模型(DOM)，以及矢量标记语言(VML)和可扩展矢量图形(SVG)技术，实现了一个基于Web的人工智能交互式汉语句型辅助教学系统。《现代汉语句法树库》是面向自然语言处理的语言知识库，《结构化句型数据库》可以归为学习者语料库，在本文系统中，实现了两种不同类型的语言知识库在语言教学系统中的融合；这使学习者不仅可以使用句型数据库的内容进行学习，而且可以同时方便地参照句法树库中相应的语言学知识。从而实践了“发现式学习”这种比较典型的以学生为中心的教学模式。本文充分利用《现代汉语句法树库》中包含的语言学知识，结合JavaScript、DOM、VML、SVG技术实现了方便的交互式汉语句法学习；这一方面拓展了面向自然语言处理的语言知识库的教学用途，另一方面对汉语句型教学提供了一种比较便利、比较直观的辅助教学工具。本文研究了句子相似度计算的相关算法，根据系统的需求选择并改进了其中的字符串编辑距离算法及树编辑距离算法，将它应用在用户句法结构分析中的综合打分环节，使这个句型辅助教学系统具备了初步的智能化特征。同时，系统使用JavaScript技术结合句法树库中的语言学知识，对用户分析的结果进行动态分析，把其中的错误实时地反馈给用户。以上工作在一定程度上实践了偏误分析在计算机辅助语言教学(CALL)中的应用。句型教学是汉语语法教学的重要组成部分，特别是在对外汉语教学中，学习者多为成年人，而且他们的母语大多有比较规范的句法结构，句型教学显得尤为重要；作为一个应用项目，笔者希望本文所实现的系统能为从事汉语教学的教师和汉语学习者提供一种有益的辅助。﹀
分类号：	TP311.52
论文总页数：	70
参考文献总数：	0
馆藏号：	017/M2009(717)
公开日期：	2009-06-01

基于语义相似度计算的汉语词义区分探索.闫国华

题名：	基于语义相似度计算的汉语词义区分探索
姓名：	闫国华
学号：	10517208
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
导师1单位：	软件与微电子学院
论文答辩日期：	2009-06-01
关键词：	目标词；搭配词；语义区分；语义相似度；层级聚类
论文摘要：	︿词义区分是词义消歧的基础，而词义消歧在诸如机器翻译、文本分类、信息检索等多个领域有重要应用，因此对词义区分的研究很有必要，但现在对它的研究还处在初步探索阶段。词的上下文环境中含有丰富的有助于词义区分的信息，其中搭配词又具有重要的语义区分价值。本文正是通过考察目标词在其上下文环境中的搭配词，对搭配词进行计算，对汉语词语的词义区分进行了初步的尝试。本文选取了十个汉语词作为目标词，抽取了这些词在分词并标注了词性的《人民日报》语料库(半年)中的搭配词，并参照《语法讲义》标注了结构；以《同义词词林》(扩展版)为分类标准，计算搭配词两两之间在词林中的欧几里德距离，并变换为相似度值，形成相似度矩阵，然后对相似度矩阵进行层级聚类，形成对这些搭配词的划分。将此划分作为中心词的一种语义区分，观察其区分效果。本文所作的工作是对汉语词义区分的初步尝试。实验结果表明，本文所使用的方法具有一定的创新性和参考价值。﹀
分类号：	H087
论文总页数：	65
参考文献总数：	0
馆藏号：	017/M2009(811)
公开日期：	2009-06-01

2008-12-01

基于角色词典的机构名识别.张伟伟

题名：	基于角色词典的机构名识别
姓名：	张伟伟
学号：	A0617077
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞士汶
论文答辩日期：	2008-12-01
关键词：	命名实体；机构名识别；条件随机场；角色词典；简称
论文摘要：	︿命名实体(Named Entity，NE)是文本中基本的信息单位，是文本中的固有名称、缩写及其他唯一标识，是正确理解文本的基础。命名实体识别是文本信息化的基础，它的研究成果将直接影响到文本信息自动化处理的深层次研究。机构名识别是命名实体识别的重要任务之一，属于自然语言处理的基础研究领域，是信息抽取、信息检索、机器翻译、组块分析、问答系统等多种自然语言处理技术的重要基础。因此，对机构名识别的研究具有很大的实用意义。本文针对机构名的组成特点，在对机构名组成结构进行细致研究的基础上，利用CRF算法构造了一个通用的可持续改善的机构名识别系统，进而通过构造领域角色词典，满足领域专业机构名识别的需要。具体来说，本文的主要内容如下：本文首先分析了中文机构名识别的困难和现有机构名识别算法的不足。通过对中文机构名组成方式的详细分析，将角色词典的概念引入机构名识别系统。接着本文详细介绍了条件随机场的定义、模型结构、参数估计等。进一步地，将条件随机场模型应用于中文机构名识别任务，提出了适合于中文命名实体的特征模板，并通过实验进行验证。最后本文实现了一个通用的、可持续改善的机构名识别系统，并在这个系统的基础上构造各领域的领域机构名识别系统。实验表明，本文提出并实现的机构名识别系统不仅在各项评价指标上达到了较高的水准，更重要的是符合机构名随时间演变的规律，可以不断改善。﹀
分类号：	TP311.5
论文总页数：	51
参考文献总数：	0
馆藏号：	017/M2008(682)
公开日期：	2008-12-01

2008-06-12

综合型语言知识库系统原型的开发与中文缩略语知识库建设.支流

题名：	综合型语言知识库系统原型的开发与中文缩略语知识库建设
姓名：	支流
学号：	10548273
论文语种：	chi
专业：	计算机软件与理论
公开时间：	3年后
培养层次：	硕士
学位：	理学硕士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2008-06-12
关键词：	自然语言处理；综合型语言知识库；部件词；缩略语；缩略语知识库
论文摘要：	︿本文的研究工作是围绕综合型语言知识库建设展开的，包括两部分：综合型语言知识库系统原型的开发与中文缩略语知识库建设。北京大学计算语言学研究所（ICL/PKU）十多年来积累了大量的语言资源。由于各个资源是独立开发的，使得逻辑上原本联系紧密的各个资源之间交叉参照困难，且无法方便地进行知识挖掘。为解决这些问题，需先填平各项资源之间的“缝隙”，然后将这些资源放在同一平台上，使得它们可以方便地进行交叉参照；同时建立数据挖掘软件，发现新知识，也就是建设综合型语言知识库系统。本文首先介绍了综合型语言知识库系统原型实现的规划和步骤，然后介绍了为填补各项资源之间缝隙而建设的部件词库及词类标记集转换表，最后详细介绍了综合型语言知识库系统原型主体部分的建设。缩略语是自然语言语汇的重要组成部分，缩略语研究也是自然语言处理的一个重要课题。本项研究的最终目标是探索中文缩略语的规律，包括缩略语的生成和还原。本文的工作旨在建设计算机自动处理中文缩略语所需的知识库。利用北大计算语言所的两大基础语言资源《现代汉语语法信息词典》和“大规模基本标注语料库”，建设了中文缩略语知识库，收录了八千条缩略语及其对应的全称，提出了面向信息处理的中文缩略语分类框架，完成了相当数量的缩略语归类，并根据计算机自动处理缩略语的需要建设了缩略语-全称对的特征词自动提取程序，为缩略语知识库中每一个缩略语-全称对自动填写特征词。﹀
分类号：	TP391
论文总页数：	59
参考文献总数：	20
馆藏号：	048/M2008(229)
公开日期：	2011-06-12

2007-12-01

大学生英语听力学习策略探索——石河子大学个案研究.李蕾

题名：	大学生英语听力学习策略探索——石河子大学个案研究
姓名：	李蕾
学号：	A0639009
论文语种：	chi
专业：	英语语言文学
公开时间：	公开
培养层次：	硕士
学位：	文学硕士
培养单位：	北京大学
院系：	外国语学院
导师1姓名：	何卫
导师1单位：	外国语学院
论文答辩日期：	2007-12-01
外文题名：	A Survey on the Listening Strategies of College English Learning and Teaching in Shihezi University
关键词：	学习策略；听力理解；大学英语；石河子大学
论文摘要：	︿在20世纪60年代，人们通常认为说和写是主动性技能，听和读是被动性技能。随着第二语言习得理论和言语理解理论的发展，20世纪80年代以来，研究者们对言语理论作了大量的实验。这些研究成果显示，听力理解已经不是一个被动的接受信息的过程，而是一个积极主动的心理认知过程，同时它还是一个相当复杂的心理语言过程，是多种语言能力、背景知识和思维能力协同作用的结果(邓道宣，2004)。西方二语／外语听力理解策略研究以学习策略为理论框架，把O'Malley & Chamot(1990)所倡导的元认知策略、认知策略和社会／情感策略看作听力理解的三种主要策略。元认知策略包括计划、自我监控和评价；认知策略包括自觉学习的方法；社会策略指的是学会如何与他人合作或求助；情感策略指在听的过程中如何降低自身的焦虑情绪。实验与研究表明，策略训练有助于激发学生的二语／外语学习兴趣，有助于进一步提高学生的二语／外语听力水平。国内对听力策略的研究主要在经济发达地区的高校进行。文秋芳、束定芳、刘绍龙等认为，认知策略与元认知策略对提高听力理解有着重要意义。﹀
分类号：	H319.9
论文总页数：	69
参考文献总数：	0
馆藏号：	039/M2007(78)
公开日期：	2007-12-01

语法隐喻对英语教学的启示.陈荣泉

题名：	语法隐喻对英语教学的启示
姓名：	陈荣泉
学号：	A0639010
论文语种：	chi
专业：	英语语言文学
公开时间：	公开
培养层次：	硕士
学位：	文学硕士
培养单位：	北京大学
院系：	外国语学院
导师1姓名：	何卫
导师1单位：	外国语学院
论文答辩日期：	2007-12-01
外文题名：	Grammatical Metaphor and Its Implication on ELT
关键词：	语法隐喻；语域；语体；启示；英语教学
论文摘要：	︿本文首先从隐喻谈起，探讨语法隐喻。考虑到对语篇隐喻目前尚无定论，本文只探讨概念隐喻和人际隐喻。本文以系统功能语法为理论框架，试图揭示语域、文体和语法隐喻之间的关系，认为语法隐喻存在于不同的语体，当语场和语旨发生非一致体现时，相应就产生了概念隐喻和人际隐喻。通过对修辞疑问句的分析，本文认为修辞疑问句属于人际隐喻中的一种情态隐喻。本文还通过分析学生写作和翻译中经常出现的错误，认为由于对语法隐喻不了解，特别是对名词化的使用掌握不好，造成学生的写作和翻译有着浓重的口语体色彩。这些错误还源于对语域、文体差异的不敏感。本文试图把语法隐喻理论运用到文学语篇分析中，认为语法隐喻有助于揭示人物性格和人物关系。除此之外，还把语法隐喻理论运用到商务英语语体分析中。通过以上分析，本文认为，语法隐喻在二语习得中起着非常重要的作用。语法隐喻能力的形成与第二语言读写能力有着密切的关系，因此在教学中，加强语法隐喻的教学将有助于提高学生的读写能力。﹀
分类号：	H314
论文总页数：	42
参考文献总数：	0
馆藏号：	039/M2007(79)
公开日期：	2007-12-01

2007-06-13

英语Be-existential plus Relative结构与含义浅析.朱希滨

题名：	英语Be-existential plus Relative结构与含义浅析
姓名：	朱希滨
学号：	10439034
论文语种：	eng
专业：	英语语言文学
公开时间：	公开
培养层次：	硕士
学位：	文学硕士
培养单位：	北京大学
院系：	外国语学院
导师1姓名：	何卫
导师1单位：	外国语学院
论文答辩日期：	2007-06-13
外文题名：	Be-existential plus Relative: Its Structure and Meaning
关键词：	存在句；Be-existential plus Relative句型；定语从句
外文关键词：	existential sentence; be-existential plus relative; relative clause
论文摘要：	︿ Be Existential-with-relative是英语存在句的一种。为了对它的结构和含义进行系统深入的分析，首先要对英语存在句有基本的认识。本文第一章从存在句的结构、定义、来源、存在句中there的特性、存在句的功能、分类、句法分析、second predication的概念及对应句等方面展开，对存在句的基本语言特征作出总体的描述，同时有针对性地详细论述若干问题，为下文分析Be Existential-with-relative打下基础。第二章是对Be Existential-with-relative的结构和含义的具体描述和分析，主要包括结构分析、从含义角度的分类、句法分析和语义特性等四个方面。结构分析的重点在从句和与其共现的其他成分，如reduced relative等。从含义的角度提出根据从句动词的特点，可以分辨出一种表示介绍事件发生的子类，与一般强调事物存在的情况相区别。句法分析中介绍了四种现有的思路，之后依据Be Existential-with-relative的最大扩展形式，分析确认从句的修饰语地位。语义特性分析通过对大量例句的解析，提出此类句型在含义和功能上的若干特征。第三章主要讨论与Be Existential-with-relative密切相关的三个尚无定论的议题：reduced relative、从句的句法定位和对应句。其中reduced relative和对应句都是可以独立的论题，文章没有对其进行全面考察，而是仅就它们与Be Existential-with-relative有关的现象尝试提出解释。﹀
外文摘要：	︿ This thesis is an attempt to account for the structure and meaning of be existential-with-relative in English. To accomplish this, a general outline of the existential sentence is first given to provide a basis for the following discussions, in the process of which special attention is directed towards those issues most relevant to be existential-with-relative. The second part of the thesis centers around major discussions on the structure and meaning of be existential-with-relative. On the basis of analyses of concrete examples, discussion on the structural pattern, the semantic classification, syntactic analysis and semantic features are carried out in turn. The main conclusions include the modifier status of the relative clause and the fact that be existential-with-relative is rich in semantic peculiarities. Chapter Three focuses on three topics of controversy in the analysis of be existential-with-relative, namely the reduced relative, the treatment of the relative clause and the non-existential counterpart. Some tentative speculations are put forward about certain issues related to these three topics. ﹀
分类号：	H31
论文总页数：	82
参考文献总数：	42
馆藏号：	039/M2007(34)
公开日期：	2007-06-13

面向文本聚类的相似度计算方法研究.王洪俊

题名：	面向文本聚类的相似度计算方法研究
姓名：	王洪俊
学号：	10308829
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	博士
学位：	理学博士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2007-06-13
外文题名：	Researches On Similarity Method For Text Clustering
关键词：	文本相似度；多特征集成；文本聚类；有监督学习；语义和语言学知识
外文关键词：	document similarity; feature combination; text clustering; supervised learning; semantic and linguistic knowledge
论文摘要：	︿作为一种基于无监督学习的知识获取方法，文本聚类是文本挖掘领域的一项重要技术。文本聚类在文档组织、信息检索、话题检测与跟踪等诸多领域都得到了普遍的应用，受到研究者的广泛重视，具有重要的研究价值。如何提高聚类质量是当前文本聚类研究面临的最大挑战。本文的目标是使用有监督机器学习方法集成不同的相似度计算方法和各种语言学知识，通过优化相似度计算来提高文本聚类质量。本文在多个标准测试集上，对文本聚类的相似度计算方法进行了系统研究，主要取得了如下一些研究成果：（1）研究了基于统计语言模型的文本聚类方法，并与向量空间模型进行了对比。比较了几种常见的数据平滑方法在文本聚类中的效果。针对统计语言模型的参数估计易受文档集影响的特点，提出一种把背景语料库的分布知识融入文本聚类计算的方法，有效地提高了文本聚类效果。（2）将有监督的机器学习方法引入文本聚类，提出一种多特征集成的相似度打分方法，采用两种有监督学习方法训练打分系统的参数：支持向量机的方法和启发式搜索的方法。在此基础上，提出对不同特征相关度值进行规格化处理以及对参数搜索空间进行限制两种改进措施，提高了参数学习效率。实验结果表明，该打分系统可以有效地实现多特征的融合，并提高文本聚类效果。在此基础上引入各种语言学知识和语义知识，进一步改进文本聚类效果。把有监督的机器学习引入无监督的文本聚类，为提高文本聚类质量提供了一种新的研究思路。这是本文的重要创新之处。（3）将语义关系与文本聚类相结合。本文将中文语义词典知网用于文本聚类，在实验多种语义概念与向量空间模型的结合方法的基础上，提出了一种语义关系和词语特征结合的多特征集成方法。同时，为了解决语义概念映射时概念映射级数与噪音同步增长的问题，提出一种基于概念相似度的权重调整算法。实验结果表明：两种方法均可以有效改进文本聚类效果。（4）研究了多种文本表示单元在中文文本聚类中的效果，发现词、单字和双字特征是最好的三种文本表示单元。单字、双字和词三种特征具有互补性，但简单地把特征混合叠加到一起对聚类效果没有改善。本文提出一种基于线性加权的多特征集成方法，把三种特征融合到一起，有效地提高了文本聚类效果。（5）将语言学知识引入文本聚类。本文比较了不同词类特征对于文本聚类的影响。实验结果表明：名词和动词是最重要的两种词类特征，仅用这两种特征就可以取得比较好的聚类效果。使用词类特征可以过滤掉很多特征，客观上起到特征选择的作用。本文将有监督机器学习和各种语言学知识引入文本聚类的研究与实践，为提高文本聚类质量的研究开拓了新的思路。﹀
外文摘要：	︿ As an effective unsupervised learning method for knowledge acquisition, text clustering is an important technology for text mining. It has been widely used in diverse fields including document organization, information retrieval, and topic detection and tracking. Text clustering has become an important research area and has gained increasing attention. How to improve the quality of text clustering is still a big challenge. This paper aims to introduce supervised learning methods to combine different document similarity methods or different linguistic features to optimize the quality of text clustering. In this paper, we study similarity methods for text clustering using several public Chinese text categorization datasets as test sets. In summary, we achieve several results as follows: 1. An SLM-based (statistical language model) text clustering method is proposed in this paper. The SLM method is used in text clustering as a document similarity method and is compared with the traditional VSM (vector space model) method. Several smoothing methods are also investigated in SLM-based text clustering. Another method, which uses a large background corpus to estimate a reference language model, is proposed and yields better results. 2. We introduce supervised learning into text clustering and propose a new document similarity method with feature combination. To train this method, we present two supervised parameter-learning algorithms: a linear search method and Support Vector Machines. Score normalization and search space limitation are proposed to accelerate the parameter learning. Experiments show that this feature combination method can effectively improve text clustering by combining the outputs of different document similarity methods. This feature combination method provides a flexible framework to incorporate arbitrary features such as linguistic features and to use a supervised learning method to improve text clustering. It is an important innovation of this paper. 3. We apply the Chinese semantic knowledge base HowNet in text clustering and try to use semantic knowledge to improve text clustering. We compare several methods of combining semantic features and word features, and propose a feature combination method to combine semantic concepts and word features. Mapping a word to more concepts with different semantic distances can utilize more semantic relationships, but it also introduces more noise. To overcome this difficulty, a concept similarity method is proposed to adjust concept weights when more concepts are used. Experiments show that both methods are effective for text clustering. 4. We investigate different text representation units in Chinese text clustering and find that Chinese word features, character unigram features and bigram features are the most effective in our experiments. Simply merging different features does not improve the results. We apply the feature combination method to combine the three features above and improve the experimental results. 5. We use linguistic features in text clustering and compare different POS features for text clustering. Experiments show that nouns and verbs are the most important POS features for Chinese text clustering. POS features can be used to remove useless features, so they play an important role in feature selection. Through studies and practice on using supervised learning methods to combine different similarity methods or linguistic features, this paper puts forward a new perspective on improving text clustering and explores new ideas for it. ﹀
分类号：	TP311.13;TP393.4
论文总页数：	147
参考文献总数：	128
馆藏号：	048/D2007(29)
公开日期：	2007-06-13

2007-06-12

最简方案理论框架下的英汉被动式句法研究.明小天

题名：	最简方案理论框架下的英汉被动式句法研究
姓名：	明小天
学号：	10439018
论文语种：	eng
专业：	英语语言文学
公开时间：	3年后
培养层次：	硕士
学位：	文学硕士
培养单位：	北京大学
院系：	外国语学院
导师1姓名：	何卫
导师1单位：	外国语学院
论文答辩日期：	2007-06-12
外文题名：	A Syntactic Study of English and Chinese Passives within the Framework of the Minimalist Program
关键词：	普遍语法；最简方案；被动式；“被”字结构；功能性成分
外文关键词：	Universal Grammar; Minimalist Program; Passive Construction; BEI Construction; Functional Category
论文摘要：	︿本文以乔姆斯基普遍语法为基础，在最简方案理论框架下对英语、汉语被动结构进行句法研究。文章首先讨论了英语、汉语被动式的定义，对本文的研究对象做出明确界定。继而结合生成语法的不同阶段对英语被动式的研究做了简要回顾，并在最简方案的框架下，简述了英语被动结构生成的主要方面。文章对汉语“被”字结构的生成句法研究进行了批判性的总结，在此基础上，作者认为，汉语的被动标记“被”并非介词或动词，而是表示“受到影响”的功能性成分。文章将汉语“被”字结构的生成描述为功能性成分“被”的最大投射，并基于此假设对于汉语“长被动句”、“短被动句”和“保留宾语”被动句的句法生成，做出了统一的描述。﹀
外文摘要：	︿ Based on Chomsky’s Universal Grammar, the author conducts a syntactic study on English and Chinese passive constructions within the framework of the minimalist program. The thesis begins with the discussion of the definitions of English and Chinese passives, and thus makes clear the object of this research. Then the author gives a brief review on the analyses of passive constructions in different periods of generative grammar, and focuses on the derivation of English passive constructions within the minimalist framework. Based on a critical review of generative analyses of Chinese BEI construction, the author argues that the passive marker BEI should be a functional category expressing “affectedness” instead of a lexical word such as a verb or a preposition. Thus the derivation of BEI construction is described as the syntactic projection of BEI. Under this hypothesis, the author gives a uniform analysis of Chinese long passive sentences, short passive sentences and retained object phenomenon. ﹀
分类号：	H31
论文总页数：	62
参考文献总数：	44
馆藏号：	039/M2007(18)
公开日期：	2010-06-12

2006-12-14

面向问答系统的情感倾向分析研究.苏祺

题名：	面向问答系统的情感倾向分析研究
姓名：	苏祺
学号：	10208847
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	博士
学位：	理学博士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2006-12-14
外文题名：	Research on Sentiment Orientation Analysis for Question Answering Applications
关键词：	自动问答；文本情感分析；情感倾向；实体评价特征；文本分类
外文关键词：	question answering; text sentiment analysis; sentiment orientation; entity features; text classification
论文摘要：	︿随着用户信息需求的不断增长，目前的web检索系统需要为用户提供更加有效、更富个性化的检索服务。其中问答式信息检索作为一种能够接受用户自然语言提问，并返回问句直接答案的新型检索技术，成为了检索系统发展的必然趋势。然而目前的自动问答研究大多关注于对客观型问题的回答方法，对如何回答用户的主观观点型提问则鲜有涉及。对观点型提问的回答要求系统具有自动计算文本中蕴含的情感倾向的能力。本文设计了一个回答用户观点型提问的问答式信息检索系统(Opinion QA系统)。研究目标在于对web实体评价数据中所包含的评价内容进行自动识别与计算，并将该任务划分为两个子任务：1)识别文本中蕴含的情感倾向及强度；2)识别情感的评价对象。对于第一个子任务，利用了手工构建并自动扩展的情感倾向词语资源、语言表述模板，并将上述语言信息融入自动文本分类技术来开展研究。对于第二个子任务，利用机器学习的聚类技术与统计互信息方法来挖掘文本中的实体评价特征及其层次结构。上述研究为观点型提问的解决提供了支持。通过对文本中蕴含的情感倾向进行分析，Opinion QA系统能够处理用户的观点型提问，并依据情感倾向的分析结果，为用户提供一个实体在web上的公众评价意见概要，为现有的检索系统服务提供重要补充。本文的主要贡献包括：1.提出了一种新型的检索系统模式——观点型问答系统；2.构建了目前中文文本情感分析研究中所缺乏、却对研究的开展有重要基础性意义的语言资源，包括情感倾向词语资源和情感倾向表述模板资源；3.将自动文本分类技术应用于中文文本情感分析，将语言资源与机器学习的分类技术相结合，提出文本中情感倾向的识别方法；4.提出实体评价语料中评价特征的抽取及识别方法；5.针对观点型问答系统，提出了新型的系统效果评测方法。作者通过对实体评价语料中情感倾向分析的研究与实践，为个性化web检索研究提供了新的研究视角，并为中文文本情感分析开拓了新的思路。在本项研究中所积累的语言资源也可以为今后的文本情感分析研究提供基础支持。尽管本文以问答系统作为应用框架，并以受限领域语料作为实验对象，但本文提出的主要技术可以方便地运用和移植到其他类似应用系统中，例如客户关系管理、自动产品推荐、有害信息过滤、社会舆情分析等等，具有广泛的应用前景。﹀
外文摘要：	︿ With the growth of users’ information requirements, web search systems must provide more effective and personalized search services. Question answering, as a new retrieval technology which can accept natural language questions and return exact answers to them, has become the advanced stage and the inevitable trend of information retrieval technology. Most question answering research has focused on objective questions; however, there are few studies on subjective questions. To answer these kinds of questions, a question answering system needs the ability to calculate the sentiment in text. This paper proposes a question answering system (the Opinion QA system) designed to respond to opinion-related questions. The objective of our research is to identify and calculate the reviews in a web corpus of entity evaluations. We divide the task into two subtasks: 1) identify the sentiment orientation and strength in text; 2) identify the objects of sentiments. For the first subtask, the author uses a sentiment lexicon, which is manually built and then extended automatically, together with a sentiment pattern database, and integrates these language resources into text classification technology. For the second subtask, the author uses an automatic clustering approach and an improved mutual information approach to mine entity features and their hierarchy. The research mentioned above supports the handling of opinion-related questions. By analyzing the sentiment orientation in text, the Opinion QA system can deal with opinion-related questions posed by users and provide a summary of public opinion on the web according to the result of the sentiment analysis. It will supplement existing IR system services. The innovations of this dissertation are as follows: 1. It proposes a new pattern of web search system, namely the opinion question answering system; 2. It constructs Chinese sentiment language resources, which are absent in present research on Chinese text sentiment analysis but indispensable for it, including a Chinese sentiment lexicon and a sentiment pattern database; 3. It applies text classification methods to Chinese text sentiment analysis, combining the language resources with the machine learning approach to identify the sentiment orientation in text; 4. It puts forward a method to extract and identify the entity features in an entity review corpus; 5. For the opinion QA system, it proposes a new system evaluation method. In this dissertation, the author has put forward a new perspective on personalized information retrieval through studies and practice on sentiment orientation analysis in an entity review corpus, and has explored new ideas for Chinese text sentiment analysis. The resources of this research provide a base for subsequent experiments. Although the research in this dissertation is based on the framework of a question answering system and deals with a corpus in a specific domain, the main technology proposed can be used in other related application systems and adapted to other domains, such as customer relationship management, automatic product recommendation, harmful information filtering, and social public opinion analysis. ﹀
分类号：	G252.7
论文总页数：	130
参考文献总数：	123
馆藏号：	048/D2007(19)
公开日期：	2006-12-14

面向领域本体进化的术语提取与术语层次关系发现.何燕

题名：	面向领域本体进化的术语提取与术语层次关系发现
姓名：	何燕
学号：	10208841
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	博士
学位：	理学博士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	穗志方
导师2单位：	信息科学技术学院
论文答辩日期：	2006-12-14
外文题名：	Term Mining and Acquisition of Hierarchy Information Oriented for Domain Ontology Enrichment
关键词：	术语自动提取；术语层次关系自动提取；ADTree；概念格
外文关键词：	term mining; automatic hierarchy relation acquiring; ADTree; Formal Concept Analysis
论文摘要：	︿领域本体的建造与进化是近年来的热点问题之一。从哲学和逻辑学的角度看，本体的实现是自莱布尼茨以来许多科学家的梦想，它基于这样一种思想：如果我们能建立一个符号系统，系统中的元素表示的都是概念、范畴，那么我们仅凭符号演算，就可以确定用这个符号系统写成的句子的意义为真或者为假。人们期待本体在人工智能中能够发挥重要作用，从目前的情况看，本体也确实已经在数字化图书馆建设、信息检索等各方面起着越来越重要的作用了。人类知识的全部本体由一个个小的本体构成，领域本体是其中最重要最基本的子集之一。本项研究针对领域本体进化中的两个基本问题：术语提取和术语层次关系提取，全面考察了术语的相关语言学规律，尤其提出了如何将语言学规则与统计方法相结合，如何在相关任务中应用更细粒度的知识提高效率，如何从语言学背景出发，为机器学习选取更适宜的语言学特征等。本文的上述研究不仅有益于本项任务的探索，也会为其他相关研究奠定良好基础。本文在研究过程中取得如下创新成果：1）知识颗粒度的细化和相关语言学特征的抽取，是与算法同等重要的内容，如何获得和应用更加细粒度的知识，如何挖掘出更有效的语言学特征，是数据挖掘中不可忽视的问题之一。基于作者的语言学背景，本文对术语提取和术语层次关系提取中所涉及的语言现象，进行了详细的描写和分类。本文第一次对单词术语从语法和语义上进行分类和特点描述，对双词和三词的术语语义组合模板进行了标注与统计，并细致深入地考察了术语层次关系在语法和语义不同层面上的特点、分布和表现，从而为进一步进行术语提取和术语层次关系提取奠定了基础。第三章和第四章主要显示了细粒度语言学知识对于达成目标的帮助，第五章和第六章主要显示了按照语言学的整体框架，从语法和语义两个平面，针对具体问题，抽取出合适的语言学特征对于达成目标的帮助。2）提出了术语部件语义模型，并对术语部件库进行了语义标注。本文在已有的部件库成果基础上，设计了与本体一致的术语部件语义模型，并对术语部件进行了语义标注。扩展后的术语部件库在术语提取和术语关系提取中都发挥了重要作用，具体来说，体现在以下几方面：① 在多词术语提取中，通过术语部件库获得双词术语和三词术语的常用语义模板，有效地提高了双词和三词术语识别效率；② 在基于模式识别的层次关系提取中，通过部件的语义类别，利用汉语的命名规律，可推导出术语的语义类别，以确定下层术语；③ 在基于概念格的层次关系提取中，依靠术语部件库对一个术语是否是领域动词或属性词进行判断，领域动词是构成术语内涵的重要元素。3）提出了将中文信息处理中常用的统计+规则的方法用于本体进化技术。本体进化是一项新兴的研究课题，从哪里入手，怎么研究，都还在探索中。本文从术语学的角度出发，提出术语提取和术语层次关系提取是基于数据驱动的本体进化中的两项重要任务，并采用中文信息处理中常用的统计+规则的方法，从语言学视角和分析出发，分别选用了基于语料库比较的方法、互信息、ADTree和FCA数学模型，初步实现了目标。本项研究所积累的资源也是重要的成果，可以对今后相关的或更进一步的术语研究提供支持。例如，术语部件语义模型、用该语义模型标注的术语部件库以及多词术语语义组合模板，术语部件库的自动、半自动扩展技术，表示偏序关系的语法和语义模式等。所有的资源、技术及实验结果都可供未来的研究参考。﹀
外文摘要：	︿ Domain ontology construction and enrichment have been among the hot issues of recent years. From the perspective of philosophy and logic, realizing ontology is a goal that many scientists have pursued since Leibniz’s time. It is based on the idea that if we have a symbolic system whose elements are concepts and categories, then we could know whether the sentences composed of the symbols are true or false. Ontology is expected to play an important role in artificial intelligence, and it has already become increasingly important in digital libraries, information retrieval, etc. The big ontology of total human knowledge is made up of small ontologies, and domain ontology is one of the most essential and important parts. This paper addresses two key problems of domain ontology enrichment: automatic term extraction and automatic hierarchy relation extraction. It fully examines the linguistic rules related to terms, and in particular suggests how to combine linguistic rules with statistical methods, how to use more fine-grained knowledge to improve efficiency, and how to select more appropriate linguistic features for machine learning based on the linguistic background. This not only benefits the research in this paper, but also lays a good basis for other related studies. The creative results achieved in this research are as follows: 1）Finer granularity of knowledge and feature selection in machine learning are as important as the algorithm for artificial intelligence. How to obtain and apply fine-grained knowledge, and how to obtain more effective linguistic features, cannot be ignored in data mining. Based on the author’s linguistics background, this paper puts forward detailed descriptions and classifications for term mining and hierarchy relation acquiring. This paper is the first to present the syntactic and semantic categories of single-word terms, the semantic templates of two-word and three-word terms, and the features and distribution of the hierarchy relations of terms, which provide the basis for further study on term mining and hierarchy relation acquiring. Chapters three and four show the help from fine-grained knowledge, and chapters five and six show the help of linguistics for feature selection. 2）A term component semantic model is designed with reference to SUMO and HowNet, which are general ontologies. According to the term component semantic model, the term component bank is extended by annotating the components’ semantic categories. The extended term component bank functions well in term mining and term hyponymy acquiring, specifically in the following aspects: ① it improves the efficiency of two-word and three-word term recognition through the semantic templates; ② it deduces the semantic categories of terms from the semantic categories of the term components, taking advantage of naming principles, to decide the sub-concept in pattern-based hierarchy relation acquiring; ③ it judges whether a word is a verb term or an attribute word in FCA-based hierarchy relation acquiring, where verb terms and attribute words show the term’s intent. 3）The common “statistics + rules” method of Chinese information processing is applied to ontology enrichment. Ontology enrichment is a new subject, and many problems, such as where to start and how to start, are still under study. Starting from the viewpoint of terminology, this paper regards term mining and hierarchy relation acquiring as two of the most important tasks for automatic data-driven ontology enrichment, adopts the “statistics + rules” method from a linguistic point of view, incorporates the mathematical models ADTree and FCA, and preliminarily achieves the target. The results of this research provide important resources for term-related or deeper research in the future, for example the semantic model of term components, the term component bank annotated with the semantic model, the semantic templates of multi-word terms, the automatic or semi-automatic extension technology for the term component bank, and the syntactic or semantic patterns showing the partial ordering relation. All the resources, models and experimental results could serve as references for later research. ﹀
分类号：	TP31
论文总页数：	122
参考文献总数：	152
馆藏号：	048/D2006(36)
公开日期：	2006-12-14

2006-06-14

《现代汉语语法信息词典》管理平台的设计开发和地名库建设.王媛媛

题名：	《现代汉语语法信息词典》管理平台的设计开发和地名库建设
姓名：	王媛媛
学号：	10448270
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	硕士
学位：	理学硕士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2006-06-14
外文题名：	The Designing and Implementing the Platform for the Grammatical Knowledge-base of Contemporary Chinese and the Construction of Place-name Knowledge-base
关键词：	自然语言处理；现代汉语语法信息词典；管理平台；地名库
外文关键词：	Natural Language Processing; the Grammatical Knowledge-base of Contemporary Chinese; Platform; Place-name Knowledge-base
论文摘要：	︿本文的研究工作包括两部分：《现代汉语语法信息词典》管理平台的设计与开发和地名库建设。语言知识库作为自然语言处理系统必不可少的组成部分一直受到研究者重视。《现代汉语语法信息词典》便是面向信息处理而研制的电子词典，并且是北大语言所所有知识库资源的第一块基石。在语法词典的研制过程中，我们深感语言知识库建设之艰辛，也认识到辅助工具对语言知识库建设的重要性，于是特别投入力量开发了一系列的辅助工具。本文介绍的语法词典管理平台便是为语法词典建设而开发，它是根据词典建设者的需求而设计，提供了加词、修改、删除、检查、检索等功能，为词典管理者提供了方便、有效的管理平台。借助这个管理工具，我们顺利完成了语法词典由7.3万词向8万词的扩展，而且从各种不同角度提高了词典质量，保证了词典不同层级数据库中的数据一致性。专有名词是自然语言处理中非常重要的一类名词，地名是专有名词的一种。为了识别专有名词，通常会扩大收词规模，语法词典将地名收录在名词库中，若大规模收录地名，必将引起语法词典信息膨胀，造成不必要的冗余。为了尽可能多的收词，又不引起语法词典信息膨胀，将地名从语法词典中分离建成地名库。本文的研究目标是根据地名特性设计地名库结构，并探索人民日报语料中地名属性的发现方法，开发辅助工具，自动构建地名库。﹀
外文摘要：	︿ The work in this thesis includes two parts. One is the platform for the Grammatical Knowledge-base of Contemporary Chinese (GKB). The other is the construction of a Place-name Knowledge-base based on the People’s Daily Corpus. Natural language processing ultimately requires the support of a powerful knowledge base. The Grammatical Knowledge-base of Contemporary Chinese is an electronic dictionary designed for information processing, and it is the first cornerstone of the language data resources of ICL/PKU. During the process of constructing GKB, we realized that auxiliary tools are very important. The present paper discusses the design of the platform for GKB and how it is realized in detail. With the platform, it is very convenient to manage GKB. The recognition of proper nouns is very difficult in NLP. In order to raise the precision of recognition, a lot of proper nouns are added into language resources. Place names are proper nouns, and they are classed into the noun table in GKB. If as many place names as possible are added into GKB, the scale of GKB will be too huge. So constructing a place-name knowledge-base is necessary. The objective of this research is to design the structure of the Place-name Knowledge-base and find an algorithm to retrieve information about place names from the People’s Daily Corpus. The author designed and implemented an auxiliary tool to construct the Place-name Knowledge-base automatically. ﹀
分类号：	TP391
论文总页数：	59
参考文献总数：	14
馆藏号：	048/M2007(214)
公开日期：	2006-06-14

2006-06-11

基于错误驱动的术语间概念关系自动提取技术研究与实现.崔高颖

题名：	基于错误驱动的术语间概念关系自动提取技术研究与实现
姓名：	崔高颖
学号：	10308087
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	硕士
学位：	理学硕士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	穗志方
导师2单位：	信息科学技术学院
论文答辩日期：	2006-06-11
外文题名：	Research & Implementation of Automatic Term Relation Extraction Based on Error Driven Method
关键词：	本体；术语间概念关系自动提取；错误驱动；互信息
外文关键词：	Ontology; Automatic Term Relation Extraction; Error Driven; Mutual Information
论文摘要：	︿ Ontology(本体)作为一种表达领域知识的手段，正在计算机科学的各个研究领域中受到越来越多的重视，并在许多领域得到广泛的应用。从自然语言处理的角度来看，构造领域Ontology的关键技术包括术语自动提取和以此为基础的术语概念关系自动提取。国内外对术语自动提取的研究已经颇具规模，相比之下，对术语概念关系自动提取的研究还处在起步阶段，因此有必要进一步探索术语间概念关系的自动提取方法。本文的研究重点是术语概念关系的自动提取方法，通过对常用术语概念关系自动提取方法的详细分析，确定了基于规则的错误驱动方法的算法设计、原型系统实现和性能改进策略。论文使用基于规则的错误驱动的方法自动获取术语概念关系，在错误驱动的过程中不断调整提取规则以提高关系提取的准确率和召回率，同时自动生成一个新的提取规则库。进而，论文尝试结合统计方法和规则方法对错误驱动的方法做改进，在系统中加入互信息对生成的新规则进行合并和优化。实验结果表明，规则的合并可以提高系统的准确率和综合评价值。最后在对实验结果进行分析的基础上，本文对术语概念关系的自动提取提出了进一步的展望。本文通过对术语间概念关系自动提取方法的分析和实践，提出了一种可行性比较强效果也较好的方案——基于规则的错误驱动的方法，并通过加入统计方法对错误驱动的方法进行了改进，得到了较好的关系提取效果和规则获取结果，在术语间概念关系的自动提取和关系提取规则的自动生成两方面进行了有意义的尝试。﹀
外文摘要：	︿ As an instrument for expressing domain knowledge, Ontology has gained more and more recognition in various research fields of computer science and has been broadly applied in many fields. From the viewpoint of natural language processing, the key technologies for building a domain Ontology include automatic term extraction and, on that basis, automatic extraction of relations between terms. Research on automatic term extraction has reached a considerable scale, while automatic relation extraction is still in its beginning stage, so exploring new methods for automatic relation extraction has practical meaning. This paper focuses on methods for automatically extracting relations between terms. After a detailed analysis of common relation extraction methods, we describe the algorithm of the rule-based error-driven method, its realization, and the improvements made to it. Using the error-driven method, the system in this paper keeps adjusting its extraction rules during the error-driven process to enhance precision and recall, and a new rule base is generated at the same time. Then the mutual information between two key words is added to realize the combination of rules. In this way, this paper combines statistical methods with rule-based methods. On the basis of an analysis of the experimental results, we offer an outlook on automatically extracting relations between terms. Through introducing and analyzing relation extraction methods, this paper puts forward a solution that is more feasible and has a good effect for automatically extracting relations between terms: a rule-based error-driven method that uses mutual information to combine rules. This is a meaningful attempt to automatically extract relations between terms and to generate extraction rules. ﹀
分类号：	TP391.1
论文总页数：	62
参考文献总数：	33
馆藏号：	048/M2006(136)
公开日期：	2006-06-11

基于支持向量机的汉语词义消歧研究.幸运

题名：	基于支持向量机的汉语词义消歧研究
姓名：	幸运
学号：	10308170
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	硕士
学位：	理学硕士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2006-06-11
外文题名：	The Research on Chinese Word Sense Disambiguation based on Support Vector Machine
关键词：	自然语言处理；词义消歧；支持向量机；有指导的学习；特征选择
外文关键词：	Natural Language Processing; Word Sense Disambiguation; Support Vector Machine; Supervised Learning; Feature Selection
论文摘要：	︿词义消歧一直是计算语言学领域的一个重要研究课题，其对机器翻译、信息检索、内容和主题分析、文本分类、语音识别等领域有着重要的影响。本文以北京大学计算语言学研究所开发的较大规模的《人民日报》词义标注语料为基础，从以下几个方面进行了研究：提出一种递减的特征选择算法考察各种上下文知识的组合对有指导词义消歧的影响。实验表明，丰富的上下文知识有利于词义消歧。采用支持向量机方法进行词义消歧，剖析了支持向量机方法的两个重要方面：核函数的选择和多类别支持向量机方法。通过实验表明，相对于其他核函数而言，线性核函数具有训练速度较快，正确率较高的特点。在目前的多类别SVM方法中，一次优化决策的方法训练速度快，易于构造，且消歧效果较好。通过上述研究，本文采用线性核、一次优化决策的多类别支持向量机方法对3个月的《人民日报》语料进行词义消歧，达到了83.82%的正确率。实验也表明使用支持向量机的方法进行词义消歧的效果比最大熵方法好，但是支持向量机方法也有训练速度较慢的缺点。本文还对SENSEVAL-3的中文评测语料进行词义消歧评测，支持向量机方法达到了64.91%的正确率，比最大熵方法提高了2.38个百分点。表明支持向量机方法在小样本情况下具有较明显的优势。﹀
外文摘要：	︿ Word Sense Disambiguation (WSD) is a hot issue in Natural Language Processing. It is very important to many research fields such as Machine Translation, Information Retrieval, etc. Based on a large-scale People’s Daily Corpus, which is developed by ICL/PKU, this dissertation involves the following research: Firstly, we propose a digressive feature selection algorithm to analyze the effect of various kinds of context knowledge on supervised WSD. The result shows that a richer context knowledge set achieves better results. Secondly, we adopt a supervised learning approach with Support Vector Machine (SVM) for WSD. Two important facets of SVM are analyzed: the kernel function and the multi-class SVM method. Experimental results show that the linear kernel function is the most efficient and more accurate than many others. Among four kinds of multi-class SVM methods, the optimization-at-once method is much simpler to construct than the others, and its performance is relatively good. Through the above research, we adopt a multi-class SVM with the linear kernel function and the optimization-at-once method, and achieve an accuracy of 83.82% on three months of the People’s Daily corpus. The experiment demonstrates that SVM is more accurate than the Maximum Entropy (ME) method, but it is more time-consuming. Also, we test on the SENSEVAL-3 Chinese sample corpus and get an accuracy of 64.91% with SVM, which is 2.38 percentage points better than ME. This shows that SVM has a clear advantage over ME on small-scale corpora. ﹀
分类号：	TP391.1
论文总页数：	51
参考文献总数：	43
馆藏号：	048/M2006(209)
公开日期：	2006-06-11

用户兴趣引导下的网页收集研究.项锟

题名：	用户兴趣引导下的网页收集研究
姓名：	项锟
学号：	10308168
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	硕士
学位：	理学硕士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
导师2姓名：	孙斌
导师2单位：	信息科学技术学院
论文答辩日期：	2006-06-11
外文题名：	Studies on Web Crawling Conducted by User Interests
关键词：	网页收集；搜索引擎；主题搜索；HTML分析；网页内容分析
外文关键词：	Web Crawling; Search Engine; Focused Crawling; HTML Parsing; Web Content Analysis
论文摘要：	︿随着Internet的普及，搜索引擎成为人们在网络上获取信息的重要方式。但通用搜索引擎无法针对用户兴趣进行个性化的定制。本文提出了用户兴趣引导下的网页收集和服务方式，在网页收集中根据用户兴趣作为网页评分和URL调度的依据，并将收集到的网页按照用户的不同需求进行分发。在本文中，作者阐述了如何处理用户兴趣引导下的网页收集中存在的各种问题，包括HTML分析、网页内容分析、网页相关度评分、URL调度等，并提出了解决方案和改进思路。本文的创新点及主要贡献如下:针对网上大量不规范的HTML文件，根据W3C的标准，设计并实现了具有容错功能的HTML分析器。根据HTML的半结构化特点，提出了用标签合并的方式提取网页正文主体，并在提取出的正文主体部分中定位网页发布时间和网页真实标题。提出了一种链接聚类算法，并将其引入Shark-Search算法，以改善网页采集中的URL调度策略。作者在用户兴趣引导下的网页收集中各种问题研究的基础上，在体育类新闻领域实现了网页收集原型系统，并在本文中提供各个部分的详细设计方案，为进一步的研究提供了实验平台和实验数据。﹀
外文摘要：	︿ With the growth of the Internet, search engines have become an important way to get information from the Web. But general search engines do not allow users to customize results according to their interests. In this paper the author proposes a new web crawling method which is conducted by user interests. According to user interests, the system scores web pages, schedules URLs during the web crawling process, and finally distributes web pages to individual users. The paper shows how to deal with several problems in this new way of web crawling, including HTML parsing, web page content analysis, page scoring and URL scheduling, and proposes solutions accordingly. The contributions of this dissertation are as follows. First, an HTML parser with fault tolerance is designed and implemented. Second, a method is proposed for extracting the main content of web pages using label merging, based on the semi-structured character of HTML files; the real title and release time are then located within the extracted content. Third, a link clustering approach is introduced into the Shark-Search algorithm to improve URL scheduling during web crawling. Based on the research into these problems in web crawling conducted by user interests, this paper implements a prototype system in the sports news field. It also provides a detailed design of each part and experimental results for future research. ﹀
分类号：	TP274
论文总页数：	73
参考文献总数：	21
馆藏号：	048/M2006(207)
公开日期：	2006-06-11

2006-06-09

汉语名词短语隐喻识别研究.王治敏

题名：	汉语名词短语隐喻识别研究
姓名：	王治敏
学号：	10308828
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	博士
学位：	理学博士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2006-06-09
外文题名：	Chinese Noun phrase metaphor recognition
关键词：	隐喻识别；特征选择；最大熵；朴素贝叶斯；隐喻映射
外文关键词：	metaphor recognition; feature selection; maximum entropy; naïve Bayes; metaphor mapping
论文摘要：	︿隐喻是自然语言处理的棘手问题之一，近几年来开始受到从事中文信息处理研究的学者们的关注。隐喻大量地存在于我们的语言生活中，Lakoff &Johnson(1980)指出隐喻不仅仅是语言的修辞手段，而且是人的一种思维方式。如果隐喻的识别和理解不能很好解决，将成为未来自然语言处理技术发展的瓶颈。本项研究面向语言信息处理，全面地考察汉语名词性隐喻的分布，总结和发现名词性隐喻的表达规律，利用机器学习的方法探索短语层级的隐喻识别，为全面的隐喻自动识别和理解奠定基础。在研究过程中取得如下创新成果：（1）提出名词隐喻的层级描写，在语义分类基础上建立以源域（source domain）为核心的名词隐喻知识架构。本文通过考察n＋n名词隐喻在构词-->词汇-->短语-->句子-->篇章等不同层级的分布规律，建立面向文本内容理解的名词“隐喻”的工程定义，确定了面向中文信息处理的隐喻研究重点：即以短语隐喻表达为核心，探索源域到目标域（target domain）的隐喻映射规律。同时从构成、句法、语义等角度对名词隐喻进行考察，建立了汉语名词隐喻的知识架构体系。（2）设计和建造了汉语隐喻知识库，在《中文概念词典》(CCD)上建立源域和目标域的映射关系，增加了CCD关于隐喻映射的描述。汉语隐喻知识库是计算机处理隐喻的重要资源。本文从大规模真实语料中发现隐喻现象，提炼加工了汉语名词隐喻词表，在此基础上又利用《现代汉语语法信息词典》(GKB)和CCD的基础平台，搭建出新的名词隐喻知识库。名词隐喻知识库一方面利用了CCD中概念存储编号的唯一性，通过人工概念消歧，建立了一个源域到多个目标域的映射关系；另一方面名词隐喻知识库的属性字段也继承了GKB的部分成果。（3）提出基于机器学习方法＋规则辅助的汉语名词隐喻识别策略，利用机器学习的分类技术解决隐喻的识别问题。本文把机器学习方法纳入隐喻计算处理的框架，隐喻识别过程被描述成隐喻义与字面义的分类问题，分别对单个词语和“n＋n”模式进行识别实验。单个词语识别充分利用隐喻标注资源和人工归纳的语言知识，通过实例方法、最大熵方法和朴素贝叶斯方法的隐喻建模，在综合上下文词语、词性等多项特征的基础上，进行了三种模型不同窗口的比较实验，最后确定最大熵模型为理想模型，然后再引入多项辅助特征来提高识别效果。“n＋n”模式识别建立在单个词语实验的基础之上，实验过程重在建立隐喻相似度推理，同时也验证了名词隐喻知识库的有效性。（4）结合CCD和隐喻知识库建立汉语名词隐喻扩展推理，进一步提高识别效果。为了能够更好地建立隐喻的相似度推理，本文运用人机互助方法对CCD词典进行了合理剪裁，建立了一个词语对应一个语义类的词典格式，为后续的相似度实验提供了保证。本项研究所积累的资源也是重要的成果，可以对今后的汉语隐喻计算研究提供支持。例如，实验所用的各种统计软件都可以作为隐喻自动识别的工具；汉语名词隐喻词表作为基础资源，为隐喻的计算理解提供了有价值的数据；汉语隐喻知识库中源域和目标域的概念映射为人们提供了一组组清晰的汉语隐喻映射图画；新闻领域和文学领域的有一定规模的名词隐喻标注语料库，为计算机的隐喻识别和理解提供了重要参考。﹀
外文摘要：	︿ Metaphor is one of the more intractable problems in natural language processing and has attracted a significant amount of attention from researchers. Metaphors are abundant in our daily language. According to Lakoff and Johnson (1980), metaphor is not only a rhetorical device of language, but also a way of thinking. If metaphor recognition and understanding cannot be solved well, they will become a bottleneck in the development of Natural Language Processing technology. This paper aims to fully examine the distribution of noun metaphors in Chinese, find the relevant rules and explore metaphor recognition at the phrase level through machine learning methods. This lays the basis for large-scale automatic metaphor recognition. The innovative results achieved in this research are as follows: (1) Put forward a hierarchical description of noun metaphors and set up a noun metaphor knowledge framework with the source domain at its core, based on semantic taxonomy. Through examining the distribution rules of noun phrase metaphors at different levels, from vocabulary and phrases to sentences and paragraphs, an engineering definition of noun “metaphor” in text is established and the research emphasis for metaphors in Chinese Information Processing is determined. The research emphasis is to explore the rules of metaphor mapping from the source domain to the target domain, with metaphorical expressions at the phrase level as the core. Noun metaphors are described from the angles of structure, syntax and semantics, which helps to set up the knowledge framework of noun metaphors in the Chinese language. (2) Design and create a Chinese metaphor knowledge base, map the relations between the source and target domains on the CCD, and extend the CCD resource with descriptions of metaphor mapping. A Chinese metaphor knowledge base is an important resource for computer processing of metaphor. This paper collects and summarizes the metaphorical expressions in modern Chinese found in a large corpus of real text. Based on the platform of GKB and CCD, a new Chinese metaphor knowledge base is established. It utilizes the uniqueness of the concept storage numbers in the CCD and builds mapping relations from one source domain to many target domains. At the same time, the description specification of the Chinese metaphor knowledge base also inherits some attribute fields of GKB. (3) Put forward a metaphor recognition strategy for Chinese nouns based on machine learning methods aided by rules, to solve the problem of metaphor recognition through the classification techniques of machine learning. This paper introduces machine learning methods into the framework of metaphor processing, in which the metaphor recognition process is described as classification between metaphorical and literal meanings. Two experiments were conducted: metaphor recognition for single words and recognition for the “n+n” pattern. The metaphor recognition for single words fully utilizes the metaphor-tagged resources and manually summarized linguistic knowledge. A comparative experiment of three types of models with the same window is made with the instance-based, Maximum Entropy and naïve Bayes methods, based on contextual words and their parts of speech. This determined that the maximum entropy model is the ideal model, and the recognition effect was then improved by introducing a number of additional features. The experiment on “n+n” pattern recognition builds on the single-word experiment. It emphasizes how to create a “similarity inference” for metaphors, and at the same time verifies the effectiveness of the noun metaphor knowledge base. (4) Establish an inference mechanism for Chinese noun metaphors by combining the CCD concept dictionary and the metaphor knowledge base, which further improves recognition performance. In order to set up a good similarity inference, the CCD was tailored through a human-machine cooperative method into a dictionary format in which every word corresponds to one semantic class. This provides a basis for the subsequent experiments. The results of this research provide important resources for metaphor recognition and understanding. For example, the statistical software we used provides good tools for the construction of a Chinese metaphor knowledge base. The Chinese noun metaphor list can be used directly for metaphor recognition, and the concept mapping between the source domain and target domain in the Chinese metaphor knowledge base also provides clear mapping pictures of Chinese noun metaphors. Furthermore, metaphor-tagged corpora of Chinese nouns of a certain scale, drawn from news and literature, are established to support the model training of metaphor recognition. ﹀
分类号：	H087
论文总页数：	0
参考文献总数：	0
馆藏号：	048/D2006(28)
公开日期：	2006-06-09

面向中文专著的汉韩机器辅助翻译研究.姜柄圭

题名：	面向中文专著的汉韩机器辅助翻译研究
姓名：	姜柄圭
学号：	10208859
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	博士
学位：	理学博士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2006-06-09
外文题名：	A study on the Chinese-Korean Computer-Aided translation for Chinese Monographs
关键词：	中文专著；汉韩机器辅助翻译；汉韩语言对比研究；术语辅助提取与翻译模块；翻译记忆模块；隐喻翻译研究
外文关键词：	Chinese Monograph; Chinese-Korean Computer Aided Translation; contrastive analysis; Term Translation System; Translation Memory System; Metaphor
论文摘要：	︿本研究详细探讨了机器辅助翻译工作模式在中文专著的韩文翻译过程中所起的作用。有别于以往单纯面向人或机器的翻译研究，这次研究工作主要是想让人和机器相辅相成协调工作。本文研究工作的主要成果可以归结为以下六个方面：第一，本文通过对中文专著的定量、定性分析，归纳了中文学术专著的语言特点。这些语言特点直接为汉韩机器辅助翻译系统的设计提供了切实的依据。第二，本文在对比语言学理论指导下，系统地考察了汉韩两种语言在词汇、句子层面上的异同点。以往的汉韩对比研究主要针对文学作品的翻译，而本文针对中文学术专著的韩文翻译，加强了学术领域的语言对比研究。第三，在专著语言特点分析以及汉韩语言对比的基础上，参照原有的汉英机器辅助翻译模型，分析其中存在的问题，本文提出了一种改进的汉韩机器辅助翻译模型。该模型的特点可以概括为：采取Unicode编码方式的系统设计；提高术语辅助翻译工具的自动化水平；提高翻译记忆系统的利用率，从而实现以小句为转换单位的翻译记忆系统。第四，本文提出了面向专著的汉语术语自动提取与韩国语辅助翻译方法。本文工作采取了“以统计方法为主，规则方法为辅”的策略。结果表明，统计和规则相结合的方法大幅度提高了术语提取以及术语翻译的效率。第五，本文提出了面向中文专著的汉韩翻译记忆模型。为实现高效率的翻译记忆模块，本文提出了小句一级的处理方法。另外，又将短语一级的翻译模板用于中文专著的韩文翻译工作。为建设翻译模板库，本文使用串频统计方法自动提取重复出现的短语。实验结果表明，中文专著虽然在句子级的重复率不高，但是在小句或短语级的重复率相当高，因此，这种方法对中文专著的翻译提供了有力的帮助。第六，本文以中文专著《现代汉语语法信息词典详解》为例，专门考察了汉语的隐喻现象和韩国语翻译问题。学术性语言中也有不少的隐喻现象，包括词汇级隐喻和语句级隐喻。从语言对比的角度看，汉韩两种语言中的隐喻表达方式不尽相同。对此，本文一面进行了详细描述，一面提出了一个隐喻翻译的策略。其中，改进的汉韩机器辅助翻译模型、汉语术语自动提取与汉韩辅助翻译、基于小句的汉韩翻译记忆模型以及专著隐喻翻译策略这几项成果是富有创新性的。虽然距离智能化的机器辅助翻译系统的最终建成还有很长的路要走，但是这些研究成果能够昭示面向受限领域的机器辅助翻译研究，特别是面向学术专著的汉韩机器辅助翻译研究的成果将产生积极的社会效益和经济效益。﹀
外文摘要：	︿ The normal practice for monograph translation is very time-consuming and difficult. Addressing the problems currently plaguing monograph translation in light of the unique features of Chinese scholarly works, this thesis calls attention to a new translation method supported by computer technology. Computer-Aided Translation (also known as Machine Aided Human Translation) is a method of translation that requires some interaction between the translator and the computer. Computer-Aided Translation is, in many ways, a quick solution to the needs of many human translators who have to handle the task of translating monographs. A monograph is a book which is a detailed study of a specific subject. In the manual, or traditional, translation of a monograph, the workload and quality requirements are very high, while the process is quite monotonous. The efficiency of translation can be improved with the help of computer technology, since there are many fixed expressions (including terms, phrases and clauses) which are repeatedly used in the monograph text. In this thesis, we consider situations where computational support can be sought to translate monographs with maximum efficiency and quality, with a focus on the Computer-Aided Translation (CAT) model for translating Chinese monographs into Korean. The contributions of this study mainly include the following: (1) The linguistic features of Chinese monographs are analyzed systematically on a statistical frequency basis. Chinese monographs have their own unique characteristics of writing style and form, as well as features shared with other domains and genres. These linguistic features are a major consideration in building the CAT system. (2) According to contrastive linguistic theory, we investigate the contrastive characteristics of lexical meanings and syntactic structures between Chinese and Korean. Our contrastive analysis is, however, focused on monograph translation, as opposed to other genres, such as literary translation. Contrastive knowledge is indispensable for building a Chinese-Korean CAT system. The similarities and differences between the two languages are thoroughly considered at every stage of system design and construction. (3) Drawing on a survey of the linguistic features of Chinese monographs, we propose a suitable Chinese-Korean CAT model. An existing Chinese-English CAT system provides a common frame of reference during the construction of the translation system. Unlike the existing Chinese-English CAT system, our system has some improved functions: a Unicode-based system which can handle multilingual documents; a term management system which improves the level of automation for translating technical terms into Korean; and a translation memory system which can find similar translation units at the clause level. (4) We suggest a methodology which aims to extract and translate Chinese terms semi-automatically. In order to extract multiword terms from monograph documents, we use a hybrid method which combines the statistical method and linguistic rules. For the translation of Chinese terms into Korean, we use the existing linguistic resources and statistical measures. We believe that Chinese terms extracted and translated in this way could be used effectively to supplement existing terminological collections. (5) A translation memory system is suggested for Chinese monograph translation. The basic idea of translation memory is to help translators deal with the translation units that are repeated or reoccur in different variations throughout the document. In our work, we try to segment long sentences into several clauses or phrases in order to improve the matching rates. Some translation patterns are built based on the statistical method of n-gram chunking. The similarity measurement for Chinese monographs is made not at the sentence level, but at the clause level. Finally, we demonstrate the effectiveness of our approach by showing a high degree of matching in the evaluation. (6) We discuss how many metaphorical expressions are used in pure academic texts, a field of study quite unexplored to date. An examination of Chinese monographs shows that many metaphors are used in academic texts as well as in literary ones, varying from lexical patterns to sentence patterns. Apart from their use in basic linguistic communication, metaphorical models play an important part in communicating new discoveries in scientific theories. In consideration of the existence of some differences in the metaphorical expressions between Chinese and Korean, we suggest building a database of metaphorical expressions in Chinese and their Korean equivalents, in order to assist the translation of metaphors in scientific documents. This study is intended to help translators translate Chinese monographs into Korean. The above-mentioned achievements are also applicable to practical translation tasks. Moreover, this study can be considered a part of our ultimate goal, a “Multilingual (Chinese-English-Korean-Japanese) CAT system for specific domains”. Although this study is a good start to the development of the Chinese-Korean CAT system, there is still a long way to go in building more intelligent multilingual CAT systems. ﹀
分类号：	H085
论文总页数：	169
参考文献总数：	145
馆藏号：	048/D2006(16)
公开日期：	2006-06-09

2005-06-10

统计与规则相结合的粗粒度词义消歧软件的设计与实现.温珍珊

题名：	统计与规则相结合的粗粒度词义消歧软件的设计与实现
姓名：	温珍珊
学号：	10208100
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	硕士
学位：	理学硕士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2005-06-10
外文题名：	Coarse-grained WSD Based on Statistical and Rule Approaches
关键词：	词义消歧；基于规则；基于实例；朴素贝叶斯；最大熵；统计方法；特征选取；投票模型；错误驱动；混合模型
外文关键词：	Word Sense Disambiguation (WSD); Rule-based; Instance-based; Naïve Bayes; Maximum Entropy; Statistical Approaches; Feature Selection; Voting System; Error-Driven Approach; Hybrid Model
论文摘要：	︿词义消歧（Word Sense Disambiguation, WSD）是当前自然语言处理中的一个热点也是难点。本文对词义消歧问题进行了研究，以《现代汉语语法信息词典》中的“同形多义词”为研究对象，以《人民日报》基本标注语料库为研究素材，设计并实现一个统计与规则方法相结合的词义消歧软件。本文首先提出并回答了关于词义消歧的三个问题，即词义消歧是什么、有什么用、怎么做，并在综述一章中概要介绍了各种不同的词义消歧方法。接着本文从整体上介绍了《人民日报》基本标注语料库的“粗粒度词义消歧”研究工作，包括研究目标、用到的各种知识资源（包括词典、语料以及上下文中包含的各种对词义消歧有帮助的信息）以及大致的试验过程。为了解决《人民日报》基本标注语料中的“同形多义词”义项的自动标注问题，本文设计并实现了一个词义消歧软件，其中包含基于规则的方法、基于实例的方法、朴素贝叶斯（Naïve Bayes）以及最大熵方法的实现，并运用基于转换的错误驱动的混合模型将这几种消歧方法结合起来。试验结果显示，各单一模型的消歧效果与baseline相比有了显著的提高，混合模型的效果与单一模型相比又有进一步的提高。﹀
外文摘要：	︿ Word Sense Disambiguation (WSD) is a popular concern in Natural Language Processing (NLP). This dissertation investigates the issue and designs and implements WSD software that targets the “multi-sense words” in “The Grammatical Knowledge-base of Contemporary Chinese” and is tested on the People’s Daily corpus. In this dissertation, three questions about WSD are first raised and answered, i.e. “what is WSD”, “why we need WSD” and “how to perform WSD”. In Chapter 2, various WSD approaches are introduced. In Chapter 3, general information on the coarse-grained WSD work is described, including the research objective, the knowledge resources used (dictionary, corpus and various information in the context that is helpful to WSD, etc.) and the testing process. To solve the automatic tagging of “multi-sense words” in the People’s Daily corpus, a system is designed and implemented based on rule-based, instance-based, Naïve Bayes and Maximum Entropy approaches, together with several hybrid models. The test results show that all the single models outperform the baseline method, and that the hybrid models outperform the single models. ﹀
分类号：	TP391.1
论文总页数：	76
参考文献总数：	29
馆藏号：	048/M2005(111)
公开日期：	2005-06-10

中文搜索结果的在线层次聚类技术.赖治国

题名：	中文搜索结果的在线层次聚类技术
姓名：	赖治国
学号：	10208064
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	硕士
学位：	理学硕士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	北京大学计算语言所
导师2姓名：	孙斌
导师2单位：	北京大学计算语言所
论文答辩日期：	2005-06-10
外文题名：	The Technology of the Chinese Search Result's Online Hierarchical Clustering
关键词：	搜索引擎；搜索结果聚类；潜在语义索引；短语提取
外文关键词：	Search Engine; Search Results Clustering; Latent Semantic Analysis; Phrase Discovery
论文摘要：	︿随着Web信息的迅猛增加，Internet越来越普及，Web已经成为世界上最大的数据资源，各种不同的应用都在使用Web数据资源。搜索引擎是目前从Web上查找信息的最常用工具，搜索引擎会根据用户的查询词快速有效地搜索出用户所需要的信息。然而，搜索引擎目前的技术还不能完全满足使用者的需求。我们认为将搜索的结果作适当聚类会很好地帮助用户搜索Web资源。在本文中，我们改进了一种基于语义的、层次的、以集簇标签为向导对搜索引擎返回的结果进行层次聚类的算法。算法的主要思想是首先推导出可以表示集簇的集簇标签，然后在这些集簇标签的基础上，将摘要分配到不同的集簇中。在本文中，我们展示了如何应用潜在语义分析技术来提取摘要集合中的主要概念作为集簇标签。在这个过程中，我们讨论了几个影响集簇标签提取质量的因素，例如搜索结果的预处理方法和基于词的短语提取方法。为了评价算法的聚类效果，我们采用了一个基于人工判断标准的评估指标。本文有如下创新贡献：讨论了词组作为集簇标签的优越性，并提出了一种利用后缀数组在切分的基础上提取词组的算法，该算法性能和效果都不错。讨论了一个利用短语来表示从摘要集合中提取出的抽象概念的方法。同时，讨论了一个将摘要分配到集簇中的方法。本文还提出了一个基于词频或者短语频率、以用户需求为导向的层次聚类方法。本文针对中文的特点，设计了一种针对中文搜索结果进行层次聚类的方法。本文设计并实现了一个原型系统，并进行了实验；最后对实验数据结果作了一定的分析。实验和分析表明本文提出了一种对中文搜索结果进行聚类的可行的思路，其方法与进一步的应用研究还有待于深化和完善。﹀
外文摘要：	︿ With the information explosion on the Web as well as the popularity of the Internet, the Web has already become the biggest data source for various applications. The Web search engine is the most commonly used tool for obtaining information from the Web effectively and efficiently according to the user’s query. However, its current state is far from satisfying users’ needs. We think that clustering of Web search results could help a lot. In this paper we propose a semantic, hierarchical, online, description-oriented algorithm for automatically clustering the results obtained from a Web search engine. The key idea of our method is to first discover meaningful cluster labels and then, based on the labels, determine the actual content of the groups. We show how cluster label discovery can be accomplished with the Latent Semantic Analysis technique. We also discuss several factors that influence the quality of cluster descriptions, such as input data preprocessing and phrase discovery. To evaluate the practical performance of our algorithm, we apply an expert-based scheme for assessing clustering results. The main contributions of this paper include the following. The benefits of using phrases as cluster labels are discussed, and an effective and efficient algorithm for phrase discovery, based on suffix arrays and word segmentation, is presented. An algorithm to discover cluster labels based on concepts is discussed, along with an algorithm to assign snippets to clusters. This paper proposes a hierarchical clustering method based on term or phrase frequency. Finally, the clustering method is designed for Chinese search results and takes the characteristics of the Chinese language into account. A demo system was designed and developed. Experiments on this system were carried out and the results were analyzed. Both experiment and analysis show that the method this paper proposes is a feasible approach to clustering Chinese search results, but the method and its applications still need further refinement. ﹀
分类号：	TP391.1
论文总页数：	68
参考文献总数：	40
馆藏号：	048/M2005(083)
公开日期：	2005-06-10

中文术语自动提取技术研究.谌贻荣

题名：	中文术语自动提取技术研究
姓名：	谌贻荣
学号：	10208039
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	硕士
学位：	理学硕士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	北京大学计算语言学研究所
导师2姓名：	穗志方
导师2单位：	北京大学计算语言学研究所
论文答辩日期：	2005-06-10
外文题名：	The Research on Automatic Chinese Term Extraction
关键词：	术语自动提取；单元度；术语度
外文关键词：	Automatic Term Extraction; Unithood; Termhood
论文摘要：	︿当今社会，术语在各个领域中层出不穷，自动术语提取日益受到人们的关注。术语是为有效表达领域知识而产生的词语单元，其计算至少分为单元度（指一个符号串作为词语出现的可能性的度量）的计算和领域性的计算两方面。本论文重点研究单元度的计算方法，该方法不仅适用于术语提取，对于新词、新语的获取也同样适合。关于单元度的计算方法，常见的方法有互信息、开方分布等，这些方法所使用的统计信息一般包括词串同现信息以及各个子串的出现概率信息。然而，在实际应用中发现，这些信息对于准确计算词串的单元度还是不够的。本论文提出了一种基于多特征的单元度计算公式。该方法在考虑词串同现信息和各个子串的出现概率的基础上，又增加了各个子串的边界变化特征信息。在具体计算时，将计算给定符号串的单元度的问题看作是计算从上下文中分割出当前符号串的概率问题，通过对符号串中的每两个连续符号之间的连接性和符号串两端的独立性的计算，得到单元度的计算值。词语单元的特性一般表现为结合紧密，使用稳定。常见的词语提取方法集中于计算候选词语的结合紧密性，而本论文提出的方法通过全面考虑候选词语边界和内部的信息，在考虑计算结合紧密性的同时，还考虑了使用稳定性。从而使词串的单元度计算更加完整可信。实验表明该算法的效果好于当前常见的词语提取方法。本论文对于术语的领域性的计算也进行了相关研究，在此基础上可以进一步从已分类的语料中抽取领域相关的新术语，从而为建立更加完整的术语领域知识体系做好准备。关于术语领域性的计算，本论文分析比较了传统的几种方法，在此基础上形成自己的领域性计算公式。通过领域性的计算，在单元度计算结果的基础上按领域分类和过滤，从而为最后的人工整理提供高质量的候选术语库。作为汉语语言计算中的一种基本算法，汉语术语的自动提取算法将在术语标准化、词典编撰、自动分词、新词语的发现和领域知识的获取等应用中发挥巨大的作用。﹀
外文摘要：	︿ Nowadays, new terms keep emerging in every domain, and domains themselves keep subdividing, so automatic term extraction is attracting more and more attention. A term is a lexical unit for effectively representing domain knowledge. Algorithms for automatic term extraction include at least the computation of the domain feature of a term and the computation of its unithood. Here we focus on the unithood measurement, which is important not only in term extraction but also in new word acquisition. Common algorithms include mutual information, χ², etc., which mainly use limited statistical information on the occurrence probability of the whole unit and of each of its elements. But in practice, we found that this is far from enough for computing the unithood of a lexical string precisely. In this paper, we propose a new method which utilizes multiple features. Besides the occurrence probability, the method also uses the marginal variety probability of the whole string and of each of its elements. In detail, the method obtains the unithood of a given string by computing the probability that segmenting the string from its context as a separate unit is correct. This probability is approximately equal to the product of the stability probabilities of each point, including the two marginal points of the string and all connecting points between adjacent elements. A lexical unit must be tightly bound and stably used. Methods in common use focus on computing inner unity; we focus on both properties by adopting more information, including marginal and inner information, which makes the result much more reliable. Experiments show that our method performs better than other methods in common use. The termhood algorithm extracts information from a classified corpus to form a more mature system of domain knowledge. Here we compare several commonly used methods and the policy of feature selection, which are then modified for use in our system. The computation of domainhood filters out garbage words, classifies terms and provides a high-quality term candidate list for human revision. Automatic Chinese term extraction is a basic algorithm in Chinese language computing. It will play a great role in term standardization, lexicon construction, word segmentation, new word recognition and acquisition of domain knowledge. ﹀
分类号：	TP391.1
论文总页数：	79
参考文献总数：	22
馆藏号：	048/M2005(064)
公开日期：	2005-06-10

2005-06-09

论汉英动结式.和平

题名：	论汉英动结式
姓名：	和平
学号：	10239013
论文语种：	eng
专业：	英语语言文学
公开时间：	公开
培养层次：	硕士
学位：	文学硕士
培养单位：	北京大学
院系：	外国语学院
导师1姓名：	何卫
导师1单位：	外国语学院
论文答辩日期：	2005-06-09
关键词：	动结式；VP壳；生成语法
论文摘要：	︿本文采用了生成语法中的支配和制约（GB）理论，对英语和汉语中的动结式句子进行了比较研究。全文包括四个部分。第一部分为绪论，从总体上介绍了英语动结式和汉语中动结式句子的构成。第二部分讨论了英语动结式的语义限制和句法限制—语义方面，英语动结式中的动词和表示结果状态的形容词短语或介词短语之间的搭配有一定的限制条件；句法方面，英语动结式受到“直接宾语限制条件”的制约。第三部分分析了汉语动结式违反 Grimshaw 的论旨阶层的现象，将其归结为句法上VP壳的作用结果。最后，由以上分析得出结论，即英语和汉语的动结式句子的各种表现都是句法作用的结果。﹀
分类号：	H314
论文总页数：	42
参考文献总数：	34
馆藏号：	039/M2005(14)
公开日期：	2005-06-09

2005-06-08

数学基础研究与转换生成语法的发展.沙陶金

题名：	数学基础研究与转换生成语法的发展
姓名：	沙陶金
学号：	10239025
论文语种：	eng
专业：	英语语言文学
公开时间：	公开
培养层次：	硕士
学位：	文学硕士
培养单位：	北京大学
院系：	外国语学院
导师1姓名：	何卫
导师1单位：	外国语学院
论文答辩日期：	2005-06-08
外文题名：	Foundational Studies in Mathematics and the Development of TGG
关键词：	转换生成语法；乔姆斯基；数学；形式主义；直觉主义；逻辑主义
外文关键词：	TGG; Chomsky; Mathematics; Formalism; Intuitionism; Logicism
论文摘要：	︿乔姆斯基创立并发展了转换生成语法，将语言学研究提升到了自然科学的地位，开拓了语言科学研究的全新视角。二十世纪初关于数学基础的大讨论极大地促进了形式科学的发展，其成果为转换生成语法提供了重要的思想源泉和有力的理论工具。本文旨在考察主要数学哲学流派的思想方法对转换生成语法所产生的一系列影响，从而更深刻地认识这一语法理论的基础和本质。全文共分六个章节：第一章简要回顾了现代语言学和自然科学的关系，并简述了转换生成语法的核心思想和发展轨迹。第二章介绍了数学基础研究的三个主要学派及其主张。之后的三章是本文的中心部分，深入讨论了形式主义、直觉主义和逻辑主义因素对乔姆斯基语言思想的影响及其在转换生成语法各个时期理论模型中的具体应用。最后一章对论文的局限性和相关的问题做了解释，并进一步阐明了从数理科学的角度研究和发展转换生成语法的前景和意义。﹀
外文摘要：	︿ This paper is devoted to exploring the influence of mathematical science on the development of Chomsky’s Transformational-Generative Grammar (TGG). It investigates in what particular aspects mathematics has shed light on the construction of TGG, focusing on the methodological analogy between the foundational studies in mathematics (i.e. Formalism, Intuitionism and Logicism) and the different theoretical models of TGG, as well as their ontological concurrency. The paper starts with a broad-brush review of the history and theoretical background of TGG. Chapter 2 outlines the three major approaches to the foundational crisis in the history of mathematics. The next three chapters are the central sections, concentrating on an in-depth analysis. Chapter 3 examines Hilbertian formalism with a focus on proof theory, recursion theory and the preoccupation with form; Chapter 4 mainly discusses Chomsky’s shift from formalism to intuitionism as reflected in several major revisions of TGG; Chapter 5 studies Fregean logicism and the application of mathematical logic in TGG, focusing particularly on the adoption of functions in the GB model. The paper concludes with a brief summary and concise explanation together with some tentative suggestions. ﹀
分类号：	H04
论文总页数：	61
参考文献总数：	50
馆藏号：	039/M2005(24)
公开日期：	2005-06-08

2004-05-23

汉英机器翻译若干关键技术研究.刘群

题名：	汉英机器翻译若干关键技术研究
姓名：	刘群
学号：	19908835
论文语种：	chi
专业：	计算机软件与理论
公开时间：	公开
培养层次：	博士
学位：	理学博士
培养单位：	北京大学
院系：	信息科学技术学院
导师1姓名：	俞士汶
导师1单位：	信息科学技术学院
论文答辩日期：	2004-05-23
外文题名：	Researches into some key aspects of Chinese-English machine translation
关键词：	汉英机器翻译；汉语词法分析；词义相似度计算；双语短语对齐；翻译模板的自动抽取；多引擎机器翻译方法
外文关键词：	Chinese-English Machine Translation; Chinese Morphological Analysis; Lexical Semantic Similarity Computing; Phrase Alignment of Parallel Corpus; Translation Template Extraction; Multi-Engine Machine Translation
论文摘要：	︿虽然机器翻译离人们的希望还有很大的距离，不过近年来统计机器翻译技术的一些进展使很多研究者相信，在现有的计算条件下通过研究方法的改进，机器翻译的水平还有较大的提高空间。作者认为，充分利用人类专家知识库、基于大规模语料库获取语言和翻译知识、建立反映语言深层结构对应关系的统计翻译模型是通向高质量机器翻译的有效途径。本文的研究工作就反映作者在这个方向上进行的一系列努力。本文主要围绕汉英机器翻译中的一些关键技术展开研究。具体来说，本文在以下方面做出了有创新性的工作： 1．提出了一种基于层叠隐马尔可夫模型的汉语词法分析算法。这个算法由多个层叠的隐马尔可夫模型构成，粗切分采用基于N最短路径的算法，简单未定义词和复合未定义词采用基于角色的隐马尔可夫模型识别新词，并采用基于角色的词语生成模型估计未定义词的概率；细切分采用词汇化的隐马尔可夫模型；词性标注采用基于词性的隐马尔可夫模型；多种模型紧密结合，下层模型不仅提供多个最好的分析结果供高层模型使用，而且也给出了这些结果的概率。模型之间环环相扣，互为补充，最终达到整体结果的最优化，同时保持算法的高效率（线性时间复杂度）。 2．提出了一种基于《知网》的词汇语义相似度计算模型。这种方法充分利用了《知网》中所包含的丰富的人类语言学知识，直接计算两个词语的语义相似度，而无需通过大规模语料库的训练，方法简单有效。这种方法可广泛用于词义排歧、基于实例的机器翻译等多种领域。 3．提出了一种高效的双语短语对齐搜索算法。这种算法的主要优点是可以尽可能避免词语对齐错误给短语对齐带来的干扰，使得短语对齐的正确率和召回率比词语对齐的相应指标都要高出很多，效果很好。算法采用柱形搜索策略，时间消耗随着句子长度线性增长，效率也非常高。 4．定义了一种可以刻画两种语言深层句法结构对应关系的短语结构转换模板，并给出了从双语短语对齐的语料库中抽取这种模板的算法。对实验结果的初步分析表明，从一个八千句子对的短语对齐语料库中抽取出来的模板，已经可以覆盖各种常见的汉英句法结构的转换模式。 5．提出了一种微引擎流水线机器翻译系统结构。在这种结构下，整个机器翻译过程被分解成若干个串行的阶段，每个阶段可以有若干个功能相似的部件（微引擎）同时工作。通过添加和删除微引擎以及调整流水线的结构很容易实现各种机器翻译构件的协调工作，而无需修改系统的总体翻译算法和数据结构，有利于提高机器翻译系统的开发效率以及尝试新的机器翻译方法。文中介绍了一个基于这种结构实现的面向新闻领域的汉英机器翻译系统，并给出了实验结果。﹀
外文摘要：	︿ Current machine translation quality is far from users’ expectations. However, recent advances in statistical machine translation have made many researchers believe that there is still considerable room to improve the quality of machine translation by improving the approaches we use. The author suggests that adequate utilization of human-made knowledge bases, acquisition of language and translation knowledge from large-scale corpora, and construction of statistical translation models which can capture the correspondence between the deep structures of the source and target languages are the proper way to achieve high-quality machine translation. This paper presents research done by the author in this direction. All the research presented in this paper concerns key technologies of Chinese-English machine translation. Specifically, the research includes: 1. An algorithm for Chinese morphological analysis based on the Cascaded Hidden Markov Model (Cascaded HMM) is proposed. In this model, multiple layers of HMMs are used to resolve various morphological problems separately. Low-level HMMs provide high-level HMMs not only with candidates, but also with the probabilities of these candidates. All the models are integrated in a single framework to achieve the best overall result, while the time cost of the algorithm remains linear. 2. An algorithm for computing semantic similarity between Chinese words based on Hownet is proposed. This algorithm utilizes the rich human knowledge in Hownet to compute the semantic similarity between two Chinese words directly, without any training on a large corpus. 3. An efficient search algorithm for phrase alignment in parallel corpora by tree-to-tree mapping is proposed. This algorithm can avoid the propagation of word alignment errors into phrase alignment, so the precision and recall of phrase alignment are much better than those of word alignment. A beam search strategy is used in the algorithm, and the time cost is linear in the length of the sentence. 4. A Phrase Structure Transduction Template (PSTT) is defined, which can capture the correspondence between the syntactic structures of the source and target languages, and an algorithm for extracting PSTTs from a phrase-aligned parallel corpus is given. A preliminary analysis of the experimental results shows that the PSTTs extracted from a phrase-aligned corpus containing 8009 sentence pairs can cover most of the frequently used transformation patterns between Chinese and English syntactic structures. 5. A micro-engine pipeline machine translation architecture is proposed. In such an architecture, several components with different algorithms are used in each phase of the system. New approaches can be tested under the architecture by adding a new micro-engine or adjusting the pipeline structure, without changing the overall algorithm of the system. ﹀
Classification number: H085
Total pages: 128
Total references: 128
Call number: 048/D2004(02)
Release date: 2004-05-23

2004-05-20

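A study on the computer aided translation for Chinese academic monographs. Bai Xiaojing (柏晓静)

Link

Title: 面向中文学术专著的机器辅助翻译研究
Name: Bai Xiaojing (柏晓静)
Student ID: 10108833
Thesis language: chi
Major: Computer Software and Theory
Access: Public
Degree level: Doctoral
Degree: Doctor of Science
Institution: Peking University
School: School of Electronics Engineering and Computer Science (信息科学技术学院)
Advisor 1: Yu Shiwen (俞士汶)
Advisor 1 affiliation: School of Electronics Engineering and Computer Science (信息科学技术学院)
Defense date: 2004-05-20
Foreign-language title: A study on the computer aided translation for Chinese academic monographs
Keywords: Chinese academic monographs; machine-aided translation; sentence similarity computation; Chinese-English contrastive study; parallel corpus
Foreign-language keywords: Chinese Academic Monograph; Computer Aided Translation; Sentence Similarity Measurement; Chinese-English Contrastive Study; Parallel Corpus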
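Abstract: ︿With the development of science and technology in China and the strengthening of international cooperation, the demand for "exporting" Chinese academic monographs keeps growing, yet the prevailing mode of monograph translation cannot effectively promote their internationalization. The emergence of machine-aided translation has brought new vitality to natural language processing research; however, existing machine-aided translation tools are still poorly targeted and offer translators very limited help. This thesis addresses machine-aided translation of Chinese academic monographs and attempts to build a practical and effective machine-aided translation model for translating such monographs into English, a task with a long working cycle, a very heavy workload, and very high quality requirements. Because a specific application domain is fixed and the linguistic features and special needs of that domain are fully considered, the model lets translators and computers complement each other better during monograph translation, thereby ensuring the efficiency and quality of the work. Around the building of this model, the thesis explores the design of the computer-aided environment, methods for building language resources, sentence similarity computation in Chinese academic monographs, and the application of Chinese-English contrastive studies to machine-aided translation. The work yields five main results: 1. Taking the introductory part of 《现代汉语语法信息词典详解》, written by Professor Yu Shiwen (俞士汶) of Peking University and colleagues, as the model monograph and the translation of its English edition as a reference case, a model is built for the machine-aided translation process of Chinese academic monographs. The model refines the division of labor and cooperation between human and computer in monograph translation and supports reuse of the process, which matters for raising the degree of automation of Chinese academic monograph translation. 2. A prototype machine-aided translation system for Chinese academic monographs was designed and co-developed, realizing part of the ideas of the model. Strong domain specificity ensures the system's effectiveness on practical problems and gives it advantages that general-purpose machine-aided translation systems lack. In practical use the prototype provided valuable reference information for optimizing the modeling ideas. 3. A method for constructing large-scale bilingual parallel corpora is proposed and used to guide construction of the system's translation memory, preparing an important language resource for the system. The method gives a reasonable formal representation of the bilingual knowledge in parallel corpora, thus directly supporting computer-aided knowledge acquisition. 4. Based on the linguistic features of Chinese academic monographs, the special needs of monograph translation, and the current state of natural language processing technology, a sentence similarity computation method suited to Chinese academic monographs is proposed, which better reveals the similarity between sentences in a monograph. Applied to the translation memory module of the prototype system, it achieved good results. 5. The lexical-semantic and syntactic-structural features of cases where the Chinese "被" (bei) construction and the English passive voice do not correspond are summarized; the results will support the translation memory module in handling expressions of the passive category. In machine translation research, using controlled language and allowing human intervention are regarded as two ways to improve translation quality. Similarly, this work essentially taps the potential of restricted domains and human-machine cooperation. The topic is posed for a very small application domain, and its results can be applied directly to translating Chinese academic monographs into English, while the related ideas, methods, and resources can also serve other domains: building machine-aided translation models for different translation tasks and carrying out the related system design, resource construction, natural language processing research, and linguistic research. The topic is a comprehensive study involving computational linguistics, linguistics, and translation studies, and also a first exploration of machine-aided translation of academic monographs. The main innovations of this work are: 1. It systematically analyzes the characteristics and problems of the translation process for Chinese academic monographs, introduces the idea of machine-aided translation into that work, and builds a machine-aided translation model for Chinese academic monographs; 2. It examines the linguistic features of monographs in depth from a computational perspective and, on that basis, proposes a sentence similarity computation method suited to Chinese academic monographs; 3. On the basis of a large-scale Chinese-English parallel corpus, it analyzes quantitatively the non-correspondence between the Chinese "被" construction and the English passive voice, obtaining linguistic knowledge that an aided translation system can use.﹀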
Foreign-language abstract: ︿ In this study, we focus on computer aided translation for Chinese academic monographs and build a practical and effective model for translating Chinese monographs into English, a task that requires high quality and normally takes a large input of time and labor. The model takes into consideration the special needs of a particular application field, especially the linguistic features involved in that application. It therefore leads the translator and the computer to a better way of coordinating, thus ensuring the efficiency and quality of a translation task. Based on this model, we further work on the design of a prototype CAT tool, the methodology for language resources development, the sentence similarity measure for Chinese monographs, and the application of bilingual knowledge to computer aided translation. The contributions of this study mainly include: 1. A model for the computer aided translation of Chinese monographs; 2. A design of a prototype CAT system for Chinese monograph translation; 3. A method for building a bilingual parallel corpus; 4. An algorithm for computing sentence similarity in Chinese monographs; 5. A contrastive study of the Chinese Bei Construction and the English Passive. In Machine Translation studies, controlled language and human intervention are deemed to be the routes to high-quality translation, and this study probes for new potential in the same two directions. We address a very limited field of application, with the above-mentioned achievements ready for use in Chinese monograph translation tasks. Moreover, the concepts, methods, and resources developed in this research can be adapted to other fields and can help with the modeling of new translation tasks and the related system design, resources development, Natural Language Processing techniques, and linguistic studies. This study is intended to bring Computational Linguistics, Linguistics, and Translation Studies together. As a pioneering work on computer aided monograph translation, it is original in the following respects: 1. We have conducted a systematic analysis of the particularities of Chinese monograph translation and brought the concept of Computer Aided Translation into the normal practice of Chinese monograph translation, together with a model for computer aided Chinese monograph translation that is ready for use. 2. We have made a thorough survey of the linguistic features of Chinese monographs from a computational perspective and have applied the findings to the sentence similarity measure for Chinese monographs. 3. Based on a large Chinese-English parallel corpus, we have conducted a quantitative and contrastive study of the Chinese Bei Construction and the English Passive, with the results applied to the prototype CAT system. ﹀
Classification number: H085
Total pages: 112
Total references: 109
Call number: 048/D2004(22)
Release date: 2004-05-20

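Studies on the information retrieval of Chinese WebPages based on the entities' attributes. Zan Hongying (昝红英)

Link

Title: 基于实体属性的中文网页检索研究
Name: Zan Hongying (昝红英)
Student ID: 10108835
Thesis language: chi
Major: Computer Software and Theory
Access: Public
Degree level: Doctoral
Degree: Doctor of Science
Institution: Peking University
School: School of Electronics Engineering and Computer Science (信息科学技术学院)
Advisor 1: Yu Shiwen (俞士汶)
Advisor 1 affiliation: School of Electronics Engineering and Computer Science (信息科学技术学院)
Advisor 2: Sun Bin (孙斌)
Advisor 2 affiliation: School of Electronics Engineering and Computer Science (信息科学技术学院)
Defense date: 2004-05-20
Foreign-language title: Studies on the information retrieval of Chinese WebPages based on the entities' attributes
Keywords: information retrieval; information extraction; entity attributes; relevance evaluation; vector space model; text categorization; Chinese Concept Dictionary; query expansion
Foreign-language keywords: Information Retrieval; Information Extraction; Entity Attribute; Relevance Evaluation; Vector Space Model; Text Categorization; Chinese Concept Dictionary; Query Expansion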
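Abstract: ︿Information retrieval means identifying, for a given user need, the information that satisfies the conditions from a large body of information, using techniques such as indexing and matching. Traditional information retrieval originated in the retrieval of textual documents; in recent years the rapid growth of the Internet has given people a massive and dynamic supply of Web page information. For personalized retrieval of entity webpages, this thesis proposes an effective retrieval method. The method lifts the handling of a retrieval problem from mechanical matching of query terms to structured matching of entity attributes, and analyzes and computes the relevance between webpage content and entity attributes at the level of the individual entity. It is more targeted and more accurate, and so provides users with an efficient, high-quality personalized retrieval service for entity webpages. Through extensive investigation of a corpus of Chinese celebrity webpages, the author proposes an attribute structure for celebrity entity information and applies information extraction techniques from natural language understanding to extract celebrity entity information from Chinese webpages. The author designs and implements an entity webpage relevance evaluation algorithm based on information extraction and another based on a combined vector space model, examines the various factors that influence users' judgments of webpage relevance, optimizes the relevance evaluation model for celebrity entity webpages through parameter tuning, and provides detailed experimental results. The relevance evaluation model has been applied directly in the TianWang Fame System (天网知名度系统), where it is the technical core of the personalized retrieval service for celebrity entity webpages. The TianWang Fame System already provides retrieval services, and its results have been recognized by the Peking University-IBM Innovation Institute (北京大学—IBM创新研究院). The main innovations of this work are: • It proposes a new working mode for online information retrieval. For users' personalized retrieval needs, named entity recognition is used to filter massive numbers of webpages in advance, so that relevance evaluation of massive webpages against user-customized entity information is achieved simply and efficiently; • It applies information extraction techniques from natural language understanding to the analysis of webpage content and, based on the features of the attribute information of celebrity entities in webpages, proposes a weighted Boolean model for evaluating the relevance of celebrity entity webpages; • On the basis of the traditional vector space model and the features of the attribute information of celebrity entities in webpages, it designs and implements a combined vector space model, which improves the storage structure of user-registered entities, makes it easier to adjust the weights of different factors, and helps raise the accuracy of relevance evaluation for celebrity entity webpages; • It uses the Chinese Concept Dictionary to run query expansion experiments from several angles on user-registered entity attribute information, examines the relevance evaluation results after query expansion for celebrity entity webpages in different domains, and analyzes in detail the strengths and weaknesses of expanding queries with different concepts. Through research and practice on relevance evaluation of celebrity entity webpages, the author has explored some workable regularities, improved the quality of information retrieval services in this field to some extent, offered a new perspective for research on personalized retrieval of Chinese entity webpages, and provided rich experimental data for further research.﹀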
Foreign-language abstract: ︿ Information Retrieval is the task of identifying the information that satisfies users' requirements with technologies such as indexing and matching. Aiming at the personalization of information retrieval, this dissertation proposes an effective method. The author uses structured entity attribute matching in preference to keyword matching and computes the relevance between webpage content and entity attributes at the entity level. With this method, an information retrieval system can achieve better pertinence and higher precision, and thereby provide an effective, high-quality personalized information retrieval service for entity webpages. Applying information extraction technology from the field of natural language understanding to the extraction of celebrities' attribute information from Chinese webpages, the author has designed and implemented two algorithms for relevance evaluation of entity webpages. They are based on information extraction and on the combined vector space model respectively. In addition, the author provides full and accurate experimental results. The innovations of this dissertation are as follows: • It proposes a new pattern of web information retrieval. Aiming at users' personalized requirements, the model filters massive numbers of webpages in advance through named entity identification. It thereby realizes relevance evaluation of massive webpages simply and effectively according to the entity information customized by users. • It applies information extraction technology to the analysis of webpage content. Based on the features of celebrities' entity attributes in Chinese webpages, the author puts forward a weighted Boolean Model to evaluate the relevance of celebrities' entity webpages. • Based on the conventional vector space model and the features of celebrities' entity attributes in Chinese webpages, the author designs a Combined Vector Space Model. The new model improves the storage structure of the entities' attributes, eases the adjustment of the various factors, and boosts the precision of relevance evaluation of celebrities' entity webpages. • Using the Chinese Concept Dictionary, the author performs query expansion on the entities' attribute information registered by users. The author then investigates the results of relevance evaluation in various fields after query expansion and analyzes in detail the advantages and disadvantages of the different concepts used. In this dissertation, the author has explored some feasible rules through study and practice of relevance evaluation of Chinese celebrities' entity webpages. The work has improved service quality to some extent in this restricted field of information retrieval, put forward a new view of personalized information retrieval, and provided plenty of experimental data for further research. ﹀
Classification number: TP391.1
Total pages: 143
Total references: 103
Call number: 048/D2004(23)
Release date: 2004-05-20

2004-05-19

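The evaluation of appraise in Chinese Webpages. Su Yumei (苏玉梅)

Link

Title: 中文网页褒贬态度的机器评价
Name: Su Yumei (苏玉梅)
Student ID: 10108083
Thesis language: chi
Major: Computational Linguistics
Access: Public
Degree level: Master's
Degree: Master of Science
Institution: Peking University
School: School of Electronics Engineering and Computer Science (信息科学技术学院)
Advisor 1: Yu Shiwen (俞士汶)
Advisor 1 affiliation: School of Electronics Engineering and Computer Science (信息科学技术学院)
Advisor 2: Sun Bin (孙斌)
Advisor 2 affiliation: School of Electronics Engineering and Computer Science (信息科学技术学院)
Defense date: 2004-05-19
Foreign-language title: The evaluation of appraise in Chinese Webpages
Keywords: appraisal (praise or criticism) attitude; machine evaluation; web information services
Foreign-language keywords: appraise; machine evaluation; Web Information Retrieval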
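Abstract: ︿The TianWang Fame System (天网知名度系统) is a study of personalized web services for user-customized entity information, built on Peking University's TianWang search engine technology and Chinese information processing technology. It focuses on algorithms for evaluating the relevance of webpages to entities, thereby improving the quality of web query services for specific information. In this research, after extensive observation of the text content of Chinese webpages, the author proposed machine evaluation of appraisal (praise or criticism) attitudes in Chinese webpages as a research direction. Based on the rhetorical nature of appraisal, the author settled on an evaluation algorithm whose strategy rests on linguistic means and domain criteria, independently completed the full design and development of the webpage appraisal evaluation module, and prepared the necessary linguistic knowledge base for it: besides converting the limited resources of an existing basic static dictionary of commendatory and derogatory words, the author collected domain-related supplementary appraisal dictionaries from real Chinese webpages and accumulated a set of linguistic form templates for expressing appraisal attitudes. The evaluation model targets Chinese webpages and, relying on the domain supplementary appraisal dictionaries, evaluates the appraisal attitude toward webpage entities. It includes a series of evaluation factors and key methods such as appraisal structure, domain criteria, entity aboutness, and appraisal guessing, and links multiple kinds of linguistic knowledge, so that it more reasonably simulates how people interpret appraisal information in webpages. The module was applied in the TianWang Fame System; evaluation tests on nearly 300 entities over 750,000 webpages yielded valuable experimental results. The research and development of the webpage appraisal relevance evaluation model is a new and meaningful attempt for the personalized web services of the TianWang Fame System.﹀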
Foreign-language abstract: ︿ The TianWang Fame System focuses on personalized Web Information Services. Based on the TianWang Information Retrieval System and Chinese Information Processing technology, it mainly developed relevance evaluation for entity WebPages and improved the ranking quality of information retrieval aimed at customized requirements. Following this work, the author proposed the evaluation of appraisal in Chinese WebPages as an extended research topic. From the viewpoint of cognitive rhetoric, the author developed an appraisal evaluation method based on linguistic knowledge and domain criteria about the entity. The author has finished implementing this module and embedded it into the TianWang Fame System. As the linguistic knowledge needed for machine evaluation, the author has formalized an elementary appraisal dictionary and established several domain supplementary lexicons from actual Chinese WebPages, in particular a series of appraisal templates. The module evaluates with a focus on the WebPage entity and domain criteria, and includes factors such as entity aboutness, appraisal structure, and appraisal hypotheses. By associating several kinds of linguistic material and simulating human cognition of appraisal rhetoric, the research has obtained valuable results and carried out a new exploration in Web Information Services. ﹀
Classification number: H087
Total pages: 40
Total references: 10
Call number: 048/M2004(114)
Release date: 2004-05-19

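PAT-Tree based domain specific keywords extraction. Wu Yonghua (吴拥华)

Link

Title: 基于PAT-Tree的领域关键词自动提取
Name: Wu Yonghua (吴拥华)
Student ID: 10108108
Thesis language: chi
Major: Computer Software and Theory
Access: Public
Degree level: Master's
Degree: Master of Science
Institution: Peking University
School: School of Electronics Engineering and Computer Science (信息科学技术学院)
Advisor 1: Yu Shiwen (俞士汶)
Advisor 1 affiliation: School of Electronics Engineering and Computer Science (信息科学技术学院)
Defense date: 2004-05-19
Foreign-language title: PAT-Tree based domain specific keywords extraction
Keywords: keyword extraction; automatic domain keyword extraction; PAT-Tree; mutual information
Foreign-language keywords: Keyword extraction; Domain specific keyword extraction; PAT-Tree; Mutual information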
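Abstract: ︿This thesis describes a domain-oriented keyword extraction system developed by the author. The system automatically extracts domain keywords from a domain-specific corpus and can discover new words that are absent from ordinary dictionaries. Extraction is based on statistical information obtained from the raw text and picks out the strings that meet the filtering conditions. Overall it has four stages: building a PAT-Tree on the raw text to obtain frequency information; extracting candidate keywords from the PAT-Tree; filtering the keywords; and selecting the domain keywords. The emphasis is placed on automatically filtering the strings that meet the statistical conditions so as to refine the candidate keywords further. The refinement uses a new filtering technique and borrows the strengths of other methods, combining them with the system into an integrated set of filters that effectively raises precision and reduces computation. Another feature of the system is that it uses a divide-and-conquer approach to handle the intensive computation and build the PAT-Tree efficiently; this makes domain keyword extraction easier and also allows the system to be implemented with distributed computing, leaving room to expand its processing capacity. Experimental results show that the system extracts domain keywords efficiently, achieves good results, and meets the expected goal.﹀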
Foreign-language abstract: ︿ This paper describes a Domain Specific Keyword Extraction system we developed, which can efficiently extract domain specific keywords and new words not present in common dictionaries. Our method is based on statistics and works on raw Chinese text corpora. It is a four-stage system: building a PAT-Tree, extracting keywords, filtering keywords, and choosing domain specific keywords. One of its novel characteristics is that we put our emphasis on the filtering phase, proposing a new filtering method in addition to adopting several other filters to increase accuracy and reduce computation. Another characteristic is that we implemented a new, efficient PAT-Tree that is suitable for distributed computing systems. The results show that our system is comparable to other keyword extraction work and that we achieved the expected goal. ﹀
Classification number: H087
Total pages: 67
Total references: 27
Call number: 048/M2004(136)
Release date: 2004-05-19

2003-05-01

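A Research of Design and Application of Grammar Information Knowledge Base of Written Tibetan. Chen Yuzhong (陈玉忠)

Link

Title: 书面藏语语法信息知识库的设计与应用研究
Name: Chen Yuzhong (陈玉忠)
Student ID: 10008822
Major: Computer Software and Theory
Access: Public
Degree level: Doctoral
Degree: Doctor of Science
Institution: Peking University
School: School of Electronics Engineering and Computer Science (信息科学技术学院)
Advisor 1: Yu Shiwen (俞士汶)
Advisor 1 affiliation: School of Electronics Engineering and Computer Science (信息科学技术学院)
Defense date: 2003-05-01
Foreign-language title: A Research of Design and Application of Grammar Information Knowledge Base of Written Tibetan
Keywords: Tibetan information processing; grammatical information knowledge base; Tibetan word segmentation; Chinese-Tibetan machine translation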
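Abstract: ︿Starting from the practical needs of Tibetan information processing, this thesis is the first to explore actively the related theories, methods, and applications of Tibetan information processing. Its results on the representation of Tibetan grammatical information and on automatic word segmentation are expected to give a direct push to the overall development of Tibetan information processing technology, and are also hoped to offer a useful reference for related research on Chinese information processing technology.﹀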
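Foreign-language abstract: ︿Starting from the practical needs of Tibetan information processing, this thesis studies the representation and application of grammatical information for written Tibetan oriented to information processing. It has seven chapters. Chapter 1 gives a broad analysis of the state of research on Tibetan information processing technology and of information processing research at home and abroad. Against this background, it sets the following guiding principles: first, attention to research on the language itself; second, active introduction of new theories and methods; third, innovation and abstraction on the basis of their organic combination. The research takes the formal representation of Tibetan grammatical information as its focus, technical implementation as its core, and the promotion of applications as its goal. Chapter 2, rooted in traditional grammatical theory and drawing on results from modern Tibetan grammar research, proposes a classification scheme based mainly on function and secondarily on morphology, yielding a classification system of modern Tibetan words for information processing. It also absorbs sound ideas from valency theory and verb case frames and, in light of the features of Tibetan, proposes a framework for subclassifying Tibetan verbs. Combining functional classification with the description of attribute features, it establishes a preliminary basic framework for representing the grammatical attribute information of Tibetan words. Chapter 3, building on this representation framework and guided by the characteristics of computer processing of Tibetan and the construction goals of the Tibetan grammatical information knowledge base, studies in detail the classified representation of the grammatical attribute information of verbs and adjectives among Tibetan predicates, of nouns, numerals, and pronouns among substantives, of case particles and connectives, and of Tibetan characters. Driven by the practical needs of engineering applications, it completes the description of attribute information for all of the more than 14,400 Tibetan characters and for most of the more than 120,000 word entries. Chapter 4, after a detailed analysis of the features of Tibetan text and of segmentation cues at each level, proposes a segmentation scheme for written Tibetan based on case particles and continuation features (Based on Case-auxiliary word and Continuous Feature, BCCF) and, with the support of the grammatical information bases for Tibetan characters and words, designs and implements an experimental segmentation system for written Tibetan based on case particles and continuation features. Experimental results show that the system adapts well to Tibetan corpora from different domains, which indicates that it is quite general. Chapter 5 gives a fairly comprehensive summary and analysis of Tibetan sentence patterns and classifies basic Tibetan sentence types using valency and case relations in light of the features of Tibetan. On this basis it discusses analysis strategies for Tibetan sentences and, for Tibetan single-verb sentences, proposes a dependency analysis method based on sentence nodes (Sentence-Node-Based Dependency Parsing Methods, SNBDPM for short). It examines the various ambiguity features of multi-verb sentences and makes some meaningful hypotheses about their syntactic features, laying a foundation for further research on automatic analysis of Tibetan. Chapter 6 discusses the design and implementation of the 班智达 (Banzhida) Chinese-Tibetan machine translation system. To understand the value and significance of Chinese-Tibetan machine translation research from a broader and higher vantage point, it first gives a brief overview of machine translation research in general, and then, drawing on the development of the Banzhida system, discusses in detail the system structure, the organization of the dictionaries, the structure of the rules, Tibetan generation, system scale and interface design, the problems of the Banzhida system, and directions for future work. Chapter 7 summarizes the research as a whole and outlines plans for further work. In sum, this thesis is the first to explore actively the related theories, methods, and applications of Tibetan information processing. Its results on the representation of Tibetan grammatical information and on automatic word segmentation are expected to give a direct push to the overall development of Tibetan information processing technology, and are also hoped to offer a useful reference for related research on Chinese information processing technology.﹀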
Classification number: H087
Total pages: 133
Total references: 161
Call number: 048/D2003(24)
Release date: 2003-05-01

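Studies on topic tracking and detection in Chinese news stories. Li Baoli (李保利)

Link

Title: 汉语新闻报道中的话题跟踪与识别研究
Name: Li Baoli (李保利)
Student ID: 10008831
Major: Computer Software and Theory
Access: Public
Degree level: Doctoral
Degree: Doctor of Science
Institution: Peking University
School: School of Electronics Engineering and Computer Science (信息科学技术学院)
Advisor 1: Yu Shiwen (俞士汶)
Advisor 1 affiliation: School of Electronics Engineering and Computer Science (信息科学技术学院)
Defense date: 2003-05-01
Foreign-language title: Studies on topic tracking and detection in Chinese news stories
Keywords: topic detection and tracking; text categorization; text clustering; information retrieval; k-nearest neighbors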
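Abstract: ︿As a new research direction in natural language processing, topic detection and tracking aims to develop a series of event-based information organization techniques that automatically detect new topics in a stream of news stories and dynamically track known topics. The large-scale evaluations held internationally since 1997 have made topic detection and tracking a research hotspot in natural language processing, especially in information retrieval, while research in China is still at an early stage. This thesis studies topic tracking and topic detection in Chinese news stories, proposes and explores several different algorithms, and tries to bring more natural language processing techniques into topic detection and tracking. The main results are: (1) The performance of several classification algorithms on Chinese text categorization is compared, and the traditional k-nearest neighbor algorithm is modified to reduce the influence of the choice of the parameter k on system performance, making it better suited to dynamic, online settings and to settings where class distributions differ widely; the modified kNN algorithm is applied to topic detection. (2) By converting the output into probabilities and setting different misclassification costs for different classes, the support vector machine algorithm is applied to topic tracking, and several topic tracking strategies combining kNN and SVM are proposed. Experiments show that the cascaded combination using the nearest-neighbor strategy is the more effective. (3) Combined with the modified kNN algorithm, a new single-pass clustering algorithm for topic detection is proposed, in which a "forgetting factor" dynamically adjusts the topic vector to model how topic content changes over time; it achieves fairly good results. (4) The possibility of introducing deeper natural language processing techniques into topic detection and tracking is explored: shallow parsing is used to enrich the text representation features, and a solution is given for recognizing and normalizing date expressions in Chinese news stories. (5) A prototype topic tracking and topic detection system for streams of Chinese news stories is designed and implemented. Although this research mainly deals with Chinese news stories, most of the algorithms and strategies proposed should also apply to other languages.﹀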
Foreign-language abstract: ︿ As a new direction of research in natural language processing, Topic Detection and Tracking aims at developing technologies for event-based information organization, such as detecting stories on novel topics and tracking stories on known topics. Since 1997, a series of evaluations of this research has been conducted, making it more and more popular in Natural Language Processing, especially in information retrieval. Research on topic detection and tracking in China is just starting. In the present dissertation, studies on Topic Tracking and Detection in Chinese news stories are carried out. We propose and explore several algorithms for Topic Tracking and Detection, and we try to apply more NLP technologies in TDT, which is usually seen as a branch of Information Retrieval. In summary, we achieve the following results: Firstly, we compare several widely used learning algorithms for Chinese text categorization, and we propose a modified k-nearest neighbor algorithm that is less sensitive to the choice of the parameter k than the traditional kNN algorithm. This revised version is suitable for on-line and dynamic applications, where estimating the parameter k via cross-validation is not possible and the document distribution across classes is uneven. We apply this modified kNN algorithm to Topic Detection. Then, we employ the SVM algorithm, which exhibits much better performance in Text Categorization, in Topic Tracking. By converting the SVM output into probabilities and setting misclassification penalties for different classes, SVM appears to be as effective as kNN in topic tracking. Moreover, we propose several strategies that combine kNN and SVM for Topic Tracking. Experiments show that methods with cascaded classifiers, trained on samples selected by the kNN strategy, are more effective than the others. Next, we investigate several algorithms for topic detection. A revised single-pass clustering algorithm, combined with our modified kNN algorithm, is proposed. To model the fact that the contents of on-topic stories change with time, we introduce a forgetfulness coefficient to adjust the topic vector in this algorithm. With this coefficient, the topic vector becomes a time-weighted sum of the story vectors on the topic. Besides the methods borrowed from information retrieval, several natural language processing technologies are also explored to tackle topic detection and tracking. We propose a method that uses a shallow parser to obtain features carrying more syntactic information, and we provide a solution for recognizing and normalizing date expressions in Chinese news stories. Preliminary experiments indicate that these methods are promising, although the performance improvements from these NLP technologies are limited. Finally, we design and implement a topic detection and tracking prototype system to deal with Chinese news stories, which is a good base for further stu ﹀
Classification number: TP391.1
Total pages: 99
Total references: 99
Call number: 048/D2003(28)
Release date: 2003-05-01

2001-05-01

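Pragmatic failure and its enlightenment to ELT. Liang Runsheng (梁润生)

Link

Title: 语用失误及其对英语教学的启示
Name: Liang Runsheng (梁润生)
Student ID: 19838015
Thesis language: English
Major: English Language and Literature
Degree level: Master's
Degree: Master of Arts
Institution: Peking University
School: School of Foreign Languages (外国语学院)
Advisor 1: He Wei (何卫)
Advisor 1 affiliation: School of Foreign Languages, Peking University (北京大学外国语学院)
Defense date: 2001-05-01
Foreign-language title: Pragmatic failure and its enlightenment to ELT
Keywords: speaking rules pragmatic failure competence cultural awareness sensitivity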
Abstract: ︿ This thesis aims at evaluating the pragmatic competence of Chinese learners of English and seeking ways to change the traditional teaching method, which stresses language competence, into a new approach that emphasizes pragmatic competence. In the introductory part, the general problems existing in the traditional teaching program are discussed; in the author's eyes, these are a direct cause of the pragmatic failures made by Chinese students. The general features of speaking rules are presented in the following chapter from perspectives such as Speech Act Theory by Austin, Indirect Speech Act by Searle, the Cooperative Principle by Grice, and the Politeness Principle by Leech. These are taken to be the speaking rules of language. Because of cultural differences, however, speaking rules often vary from country to country. When interacting with native speakers of English, therefore, Chinese learners need to observe English speaking rules. If they cannot, and simply take every English word or sentence literally, they are unlikely to understand native speakers correctly, and pragmatic failure will frequently result. However, there has been a misconception that once language learners have grasped the systematic knowledge of a foreign language in terms of phonetics, grammar, and vocabulary, they will be able to use the language automatically. The same is true of English teaching in China, which has long attached much importance to teaching the so-called common core of the language, on the assumption that when equipped with a solid basis of English phonetics, grammar, and vocabulary, Chinese learners will be able to use the language correctly and communicate successfully. This is not necessarily the case. Therefore, a questionnaire was designed and given to 186 non-English-major undergraduate students at Yan'an University to test their pragmatic competence. The result is shown in Chapter 2 and reveals that the pragmatic failure of the students is teaching-induced. To develop Chinese learners' ability in intercultural communication, therefore, is to develop not only their language competence but their pragmatic competence as well. Based on this consideration, the following chapter suggests that, in doing so, it is important to develop learners' abilities in terms of pragmatic competence, cultural awareness, and pragmatic sensitivity. The thesis concludes that once equipped with these abilities, the learners will be able to communicate with native English speakers both accurately and appropriately. ﹀
Total pages: 52
Total references: 0
Call number: 039/M2001(59)
Release date: 2001-05-01

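Popular error correction: "he" or "she". Wang Qiaoyun (王巧云)

Link

Title: 关于常见错误“他”和“她”混用的改正
Name: Wang Qiaoyun (王巧云)
Student ID: 19738028
Thesis language: English
Major: English Language and Literature
Degree level: Master's
Degree: Master of Arts
Institution: Peking University
School: School of Foreign Languages (外国语学院)
Advisor 1: He Wei (何卫)
Advisor 1 affiliation: School of Foreign Languages, Peking University (北京大学外国语学院)
Defense date: 2001-05-01
Foreign-language title: Popular error correction: "he" or "she"
Keywords: popular oral errors self-recording cognitive behavior modification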
Abstract: ︿ Some oral errors are always "popular". For example, many Chinese beginners learning oral English misuse the third person singular pronouns "he" and "she". They know about these mistakes, but they fail to correct them. As a result, there is a need to find an effective method for correcting these errors. In the field of error correction, however, the teacher's rectifying role has been overemphasized. This has left students in a passive position, merely receiving the teacher's corrections, and has eventually rendered them incapable of correcting their own errors. This study was proposed in order to get students to self-correct their errors. It set out to try a new method for correcting the popular error (the incorrect use of "he" or "she") by following the self-recording procedures advocated by Cognitive Behavior Modification theory. The theoretical basis is the principle of reinforcement in behavioral psychology. A two-month experimental study was carried out to examine the effectiveness of this approach to popular error correction. Two groups of students were involved in the study: a control group that did nothing with the errors and an experimental group that received a self-recording treatment. The experimental results confirmed the hypothesis that, after a two-month self-recording treatment, the students of the experimental group would make fewer such errors (the incorrect use of "he" or "she" in their speech) than those of the control group. This showed that the advocated method was effective in reducing the frequency of the popular error. ﹀
Total pages: 48
Total references: 0
Call number: 039/M2001(57)
Release date: 2001-05-01

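A formal study of nominal phrases. Liu Hongyong (刘鸿勇)

Link

Title: 名词性短语的形式化研究
Name: Liu Hongyong (刘鸿勇)
Student ID: 19838008
Thesis language: English
Major: English Language and Literature
Degree level: Master's
Degree: Master of Arts
Institution: Peking University
School: School of Foreign Languages (外国语学院)
Advisor 1: He Wei (何卫)
Advisor 1 affiliation: School of Foreign Languages, Peking University (北京大学外国语学院)
Defense date: 2001-05-01
Foreign-language title: A formal study of nominal phrases
Keywords: Nominal Structure Movement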
Abstract: ︿ This paper presents a formal study of nominal phrases, which are traditionally called Noun Phrases (NPs) and recently called Determiner Phrases (DPs), on the basis of three theoretical models of generative grammar: Phrase Structure Grammar, the Government-Binding (GB) Theory, and the Minimalist Program. The paper consists of four parts. Part One, the Introduction, gives a survey of the study of nominal phrases and introduces the three models, which jointly constitute the theoretical background of this thesis. Part Two examines the internal structure of nominal phrases. In order to overcome the problems inherent in Phrase Structure Grammar, the X-bar framework is proposed, which includes a set of principles: the Projection Principle, the theta criterion, and the case theory. With the interaction of all these principles, we can generate proper NPs. With the development of the X-bar theory, nominal phrases are no longer regarded as the maximal projection of Nouns, but rather as that of Determiners. This is called the DP hypothesis. After introducing the DP hypothesis, we use some Chinese examples to examine the reasons for DPs to replace NPs. Part Three examines the external structure of DPs. Within different theoretical frameworks (the GB Theory and the Minimalist Program respectively), the external structure (i.e. the functioning of nominal phrases at the level of sentences) has to be described differently. We concentrate on two representative types of structures, passive sentences and raising structures, in both of which the movement of DPs plays an essential role in forming the structures. Based on the above analysis, the last part comes to a tentative conclusion concerning the status of nominal phrases in the Chinese language: nominal phrases in Chinese are DPs with a determiner and a lower-level DP before a noun. ﹀
Total pages: 53
Total references: 0
Call number: 039/M2001(38)
Release date: 2001-05-01

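A content chunk parser for unrestricted Chinese text. Sun Honglin (孙宏林)

Link

Title: 现代汉语非受限文本的实语块分析
Name: Sun Honglin (孙宏林)
Student ID: 19608806
Thesis language: Chinese
Major: Computer Software and Theory
Degree level: Doctoral
Degree: Doctor of Science
Institution: Peking University
Department: Department of Computer Science and Technology (计算机科学技术系)
Advisor 1: Yu Shiwen (俞士汶)
Advisor 1 affiliation: Department of Computer Science and Technology, Peking University (北京大学计算机科学技术系)
Defense date: 2001-05-01
Foreign-language title: A content chunk parser for unrestricted Chinese text
Keywords: computational linguistics; natural language processing; shallow parsing; information extraction; content chunk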
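Abstract: ︿Automatic syntactic parsing of unrestricted natural language text remains a great challenge for natural language processing; even for a well-studied language such as English there is still no high-performance parser that can handle unrestricted text. One route to solving the parsing problem is a divide-and-conquer strategy that breaks the complex parsing task into several mutually independent subtasks. The content chunk parsing proposed in this thesis is a shallow parsing task conceived along these lines; its goal is to recover the possible structures within consecutive strings of content words in text. Because it largely sidesteps the long-distance dependencies associated with many function words, content chunk parsing can achieve high performance and efficiency. Its results simplify sentence structure and so reduce the ambiguity and complexity of full parsing. The research shows that content chunk parsing is a clearly definable and relatively independent parsing subtask and that, compared with shallow parsing tasks such as base noun phrase analysis, it yields more information about sentence structure. The thesis describes a complete Chinese content chunk parsing system that takes unrestricted natural language text as input and outputs the results of word segmentation, part-of-speech tagging, named entity recognition, and content chunk parsing. Specifically, the thesis achieves the following: (1) It proposes the task of Chinese content chunk parsing and shows experimentally that it is a clearly definable and relatively independent parsing subtask. (2) It proposes a Chinese content chunk parsing model that combines probabilistic context-free grammar with probabilistic features. The model constrains the context-free grammar with syntactic features tied to the rules. The learnability of these features makes the model more robust, and because the features carry probabilities, the model has stronger disambiguation ability. (3) Based on the fact that rhythm constrains syntax in Chinese, it proposes constraining rules with syllable and length information and makes the rhythm features probabilistic. (4) It proposes a probabilistic model of Chinese coordinate structures that uses the symmetry of coordination to choose the correct conjuncts among several candidates. (5) It proposes a quantitative method for analyzing the complexity of a part-of-speech tag set and, on that basis, a tag set optimization method based on genetic algorithms. (6) It implements a complete Chinese content chunk parsing system that takes unrestricted Chinese text as input and gives content chunk parsing results as output; besides content chunks, it also provides word segmentation, part-of-speech tagging, and named entity recognition information. The results of this thesis can be applied in natural language processing systems such as information extraction, information retrieval, and machine translation. On top of the content chunk parsing system described here, it should be possible to develop a Chinese parser capable of handling unrestricted text.﹀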
Foreign-language abstract: ︿ Automatically parsing unrestricted natural language text is still a great challenge for natural language processing (NLP). Even for well-studied languages like English, a robust parser that can deal with unrestricted text with high performance is not yet available. One approach to difficult problems like parsing is a divide-and-conquer strategy, i.e., dividing the whole complicated problem into several independent sub-problems that can be solved relatively easily. Content chunk parsing, proposed in this thesis, is one such parsing subtask, whose objective is to acquire possible structures from a sequence of content words. Because many of the long-distance dependencies associated with function words can be avoided, for example determining the right boundary of a prepositional phrase or the left boundary of a "DE structure" in Chinese, content chunk parsing can obtain high performance and efficiency. The result of content chunk parsing simplifies a sentence and leads to a noticeable decrease in ambiguity and complexity in full parsing. The research in this thesis indicates that content chunk parsing is a well-defined, easy to understand, and relatively independent subtask in parsing. Compared with baseNP analysis, it can acquire more structural information from sentences. This thesis describes a complete content chunk parser that takes unrestricted Chinese text as input and gives content chunks as output, from which information about word segmentation, POS tagging, and Named Entity recognition can be obtained as well. The thesis makes the following contributions: (1) It defines the task of content chunk parsing and shows that content chunk parsing is a well-defined, easy to understand, and relatively independent subtask in parsing that can be dealt with independently. (2) It proposes a model integrating probabilistic context-free grammar (PCFG) and probabilistic features for content chunk parsing. This model constrains the PCFG using the syntactic features associated with specific rules. The learnability of the syntactic features makes the model robust, and the stochastic nature of the features gives the model more power for structural disambiguation. (3) It proposes the idea of using rhythm features to constrain the over-generation of context-free grammar rules, based on the observation that rhythm can have an effect on syntax. (4) It proposes a probabilistic model for coordinate constructions based on the observation that the members of coordinate constructions tend to be symmetric in both syntactic and semantic aspects. The model can be used to make correct choices of coordinate members among the two groups of candidates on either side of a coordinate conjunction. (5) It proposes a quantitative method for measuring the complexity of a part-of-speech tag set and a method for optimizing a POS tag set using genetic algorithms. (6) It implements a whole content chunk parser which takes u ﹀
Total pages: 99
Total references: 0
Call number: 008/D2001(01)
Release date: 2001-05-01

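The lexicon meaning analysis-based computer aided research work of Chinese ancient poems. Hu Junfeng (胡俊峰)

Link

Title: 基于词汇语义分析的唐宋诗计算机辅助深层研究
Name: Hu Junfeng (胡俊峰)
Student ID: 19808806
Thesis language: Chinese
Major: Computer Software and Theory
Degree level: Doctoral
Degree: Doctor of Science
Institution: Peking University
Department: Institute of Computational Linguistics (计算语言学研究所)
Advisor 1: Yu Shiwen (俞士汶)
Advisor 1 affiliation: Institute of Computational Linguistics, Peking University (北京大学计算语言学研究所)
Defense date: 2001-05-01
Foreign-language title: The lexicon meaning analysis-based computer aided research work of Chinese ancient poems
Keywords: corpus linguistics; Tang and Song poetry; aided research; unknown-word extraction; semantic analysis; word formation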
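Abstract: ︿Computer-aided research on Tang and Song poetry is a long-term project of the Institute of Computational Linguistics at Peking University. The work in this thesis began in 1996. It can largely be seen as application-driven research: most of it revolves around the development of the "Computer-Aided Research System for Tang and Song Poetry", and most of its results have been verified and applied as parts of that system. The research falls into the following major parts. 1. Automatic word extraction from a Tang and Song poetry corpus and construction of a word-based statistical knowledge base. By introducing statistical parameters for two-character strings such as the "insertion rate" and the "relative co-occurrence degree", the existing mutual-information-based algorithm for extracting unknown words is improved. After automatic discovery of multi-character words in a Tang and Song poetry corpus of 6.4 million characters, 41,732 multi-character words were extracted with manual proofreading. Indexes of personal names, place names, and words in classical poetry were built. Statistics such as character and word frequencies were extracted from several angles, including author and period, providing a data foundation for studying the styles of authors and periods. 2. Development and application of the computer-aided research system for Tang and Song poetry. Building on the previous step and addressing the needs of Tang and Song poetry research, the system uses multi-condition compound retrieval to provide word-based retrieval and statistical analysis of Tang and Song poetry. It offers statistical functions such as word co-occurrence, antithesis, and author feature analysis. On top of the retrieval functions, features such as similar-line retrieval and automatic phonetic annotation were developed, producing a practical computer-aided research system for Tang and Song poetry. 3. Statistics-based automatic discovery of semantic relations between words. The study of language can hardly avoid the study of meaning in the end. This part starts from comparing the similarity of the sets of words that co-occur with each word and quantifies the semantic similarity between words by statistical means. It then proposes a definition of semantic distance and an algorithm for it. The network of near-synonym relations built on this basis, and the Tang and Song poetry retrieval engine based on near-synonym relations, provide an application-side standard for evaluating this research. 4. Automatic extraction and study of Chinese word-formation rules. This part is based on 《现代汉语语法信息词典》 (1999 edition, 70,000 words in total), developed by the Institute of Computational Linguistics at Peking University. Using the 40,318 two-character words recorded in it and their lexical and grammatical attributes, and referring to the morpheme attribute descriptions in the "Contemporary Chinese Morpheme Database" (现代汉语语素库), the internal structure of every two-character word was annotated. On this basis, 21,301 word-formation rules were extracted at the character level by statistical methods, and the practicality of the results was verified on a corpus of Song dynasty poetry. Finally, as a theoretical generalization, the concept of a generalized word-formation structure for Chinese is proposed, bringing commonly used segmentation dictionaries and word formation into a unified theoretical system and providing a theoretical basis for further research.﹀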
外文摘要：	︿ The computer-aided study of ancient Chinese poems is a long-term project of ICL/PKU. The work described in this thesis started in 1996. It can be viewed as an application-driven project: almost all of the work began with the requirements of the Computer Aided Analysis System of Chinese Ancient Poems and was, in the end, validated and applied in that system. The research consists of four main parts: 1. Corpus-based extraction of lexical items (Meaningful Units, MU) and the construction of a statistical lexicon database of ancient Chinese poems. Based on a corpus of 6.4 million Chinese characters of ancient poems, a statistical model is introduced that combines three statistical criteria: frequency, insert rate and mutual information. As a result, not only are frequently used MUs recognized, but many less frequent ones are also found because of their very low insert rate and high mutual information. In addition, once MU segmentation is finished, MU-based collocation and antithesis information is acquired automatically. 2. Development of the 'Computer aided research system of Chinese ancient poems'. Besides full-text retrieval, the system provides MU-based statistical analysis, sentence-based similarity retrieval, automatic Pinyin tagging and other useful functions that support deep analysis of ancient Chinese poems. The project was funded by the National Social Science Foundation in 1998-1999. 3. Corpus-based extraction of words with similar contexts. A statistical approach is used to find the Similar Functional Sets (SFS) among 12,342 frequently used words. Further work on the SFS finally leads to the automatic explication of the different senses of each individual word.
4. Automatic acquisition of word formation rules from the tagged dictionary. By marking up the inner structure of all two-character words in the Grammatical Dictionary of Contemporary Chinese, 21301 rules of productive morphology are obtained automatically, giving a detailed description of the productive ability of each individual Chinese character. Applying these rules to unlisted-word discovery in ancient Chinese poems yields an 11% improvement in accuracy. Furthermore, as a theoretical extension, the Universal Productive Morphology is defined, which leads to a productive definition of the Chinese word. ﹀
论文总页数：	58
参考文献总数：	0
馆藏号：	008/D2001(10)
公开日期：	2001-05-01
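
The record above says the thesis improves a mutual-information-based algorithm for extracting unlisted words. The abstract does not give the formula, so the following is only a minimal sketch that assumes the standard pointwise mutual information (PMI) of adjacent character pairs; the function name `bigram_pmi` is hypothetical, and the thesis's additional statistics (insert rate, relative co-occurrence) are not modeled here.

```python
import math
from collections import Counter


def bigram_pmi(lines):
    """Return {pair: PMI} for every adjacent character pair in `lines`.

    PMI(xy) = log2( p(xy) / (p(x) * p(y)) ), with probabilities estimated
    from raw counts. This is the textbook definition, not the thesis's
    exact scoring function (assumption).
    """
    char_counts = Counter()
    pair_counts = Counter()
    for line in lines:
        char_counts.update(line)
        pair_counts.update(line[i:i + 2] for i in range(len(line) - 1))

    total_chars = sum(char_counts.values())
    total_pairs = sum(pair_counts.values())

    scores = {}
    for pair, count in pair_counts.items():
        p_pair = count / total_pairs
        p_first = char_counts[pair[0]] / total_chars
        p_second = char_counts[pair[1]] / total_chars
        scores[pair] = math.log2(p_pair / (p_first * p_second))
    return scores


if __name__ == "__main__":
    verses = ["明月松间照", "明月几时有", "清泉石上流"]
    scores = bigram_pmi(verses)
    # Print the pairs with the highest PMI first.
    for pair, score in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
        print(pair, round(score, 3))
```

Plain PMI favors rare pairs, which is presumably one reason the thesis combines it with frequency and other statistics before manual proofreading.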

2000-06-01

继承——归纳机制及其在对象系统和信息提取技术中的应用.孙斌

题名：	继承——归纳机制及其在对象系统和信息提取技术中的应用
姓名：	孙斌
学号：	19708806
专业：	计算机软件与理论
公开时间：	公开
培养层次：	博士
学位：	理学博士
培养单位：	北京大学
院系：	计算机科学技术系
专业：	计算机软件与理论
导师1姓名：	俞士汶
导师1单位：	北京大学
论文答辩日期：	2000-06-01
关键词：	对象关系；继承；归纳；软件技术；面向对象
论文摘要：	︿继承在当前的对象技术中起着非常重要的作用。很长时期以来，继承是大多数面向对象模型采用的对象关系机制。它基于真实世界中对象共享属性和行为、对象之间存在着从一般到特殊的关系这一事实。这允许我们把对象的关系表示为“自顶向下”的特征传递，即子对象从其父对象获得属性和方法，并且可以增加新的特征。尽管对于许多应用领域而言继承是一种很好的机制，但它只是对象众多关系之中的一种。当对象技术运用到越来越多的领域时，仅有继承机制有可能是不足的，而需要考虑新的对象关系，尤其是与继承方向相反的特征传递关系。本文讨论了两种与继承方向相反的特征传递关系，包括特征归纳和对象相互扩充。由于现有的对象模型不直接支持对象特征的逆向传播机制，在其中试图表达这些关系是十分困难的。本文首次在相关的计算机理论和技术领域引入了归纳关系，并将其同已有的继承机制结合起来，对二者的结合作出了比较完整的阐述。归纳机制是显式地表示对象间共性综合所必要的，而它往往被大多数的对象模型排除在求解过程之外，不被直接支持。继承和归纳是模拟计算对象的两个互相补充的过程，二者的结合不仅可能，而且是必要的。论文包括对归纳和继承关系的一般性描述、继承-归纳网络（IIN）、双向派生模型、归纳子类型等，揭示了继承-归纳关系的一系列重要和有用的性质，并且探索了它们在对象模型、软件技术、编程理论和方法、自然语言处理、信息提取等方面的应用，给出了两个具体的应用实例，即XOO模型与C编程语言、InfoX信息提取系统。通过建立对象之间的虚归纳关系，C语言比较理想地解决了困扰当前大多数OOP系统的逆向抽象问题，并由此建立起一种带有普遍性的软件开发范式，即归纳范式。InfoX模型则将特征结构的继承-归纳关系运用到信息提取技术中，将当前IE系统中孤立的信息描述模板以信息类型的关系联系起来，使信息类型的重用成为可能，并提供了通过类型关系推断来提高处理效率和可靠性的途径。﹀
外文摘要：	︿ The description mechanism of object relations plays a very important role in the field of object technology. For a long time, the object relation mechanism adopted by most object-oriented models has been inheritance, which is based on the fact that objects in the real world share attributes and behavior, and that a relationship from the general to the special exists among some objects. This allows us to express object relations by "top-down" feature propagation: child objects get some of their attributes and methods from their parent(s) and may also add new features, which reflects our thinking process of problem solving from the abstract to the concrete. While inheritance is good and powerful for many application domains, it is just one of the various object relations, and it is a "one-way" derivation. New mechanisms are necessary when we apply object-oriented principles to more complex problems where fundamental object relations beyond inheritance need to be taken into account. In particular, feature propagation in the reverse direction to inheritance exists. This can cause difficulties that are mainly related to the inheritance mechanism of current OO systems. A new approach is necessary to take the reverse propagation of object features into consideration in a systematic way, leading to an extended framework of the current OO model. This paper discusses two fundamental bottom-up feature propagation relations: feature induction and bottom-up derivation. Induction is a necessary mechanism for explicitly expressing the summing-up of commonality among objects; it is often excluded from problem-solving models and is not supported directly by most current OO systems. Another kind of bottom-up feature propagation is feature exchange between objects; that is, two objects get their features from each other.
By addressing a few typical problems that may cause substantial difficulties for existing OO technology and devising corresponding answers, this paper presents a full description of the ideas of object induction, as well as the inductive polymorphism, virtual induction, which is a powerful mechanism for abstracting and reusing existing code and can be blended well with inheritance. We show by examples that extensible systems can be built on the combination of virtual induction and inheritance. Induction is an essential mechanism of the XOO model. C** is the implementation of the XOO model in the context of an efficient, terse and general-purpose programming language in the tradition of C and C++. The inheritance-induction mechanism can also be applied to information extraction technology, permitting type inference facilities for the recognition of related information templates. The InfoX system presented in this paper is the first IE system that constructs a reusable information template type hierarchy with inheritance-induction and employs such kind of information subtyping relations i ﹀
分类号：	TP311.1
论文总页数：	190
参考文献总数：	103
馆藏号：	008/D2000(03)
公开日期：	2000-06-01

2000-05-01

双语语料库的 XML 表示及其自动分类方法研究.程兆炜

题名：	双语语料库的 XML 表示及其自动分类方法研究
姓名：	程兆炜
学号：	19708008
论文语种：	汉语
专业：	计算语言学
培养层次：	硕士
学位：	理学硕士
培养单位：	北京大学
院系：	计算机科学技术系
专业：	计算语言学
导师1姓名：	俞士汶
导师1单位：	北京大学计算机科学技术系
论文答辩日期：	2000-05-01
关键词：	双语语料库；XML；文档自动分类；反比类文档频数；Find Similar算法
论文摘要：	︿随着自然语言处理研究的发展，双语语料库的研究也越来越受到重视。为了更好地对双语语料库进行管理和维护，必须建立一个“双语语料库管理平台”。本文对建立该平台的两个基础工作——双语语料库的表示方法和双语语料库的自动分类——进行了探索。本文首先探讨了如何使用XML来表示双语语料库。XML是一个可扩展的标记语言，本文利用它的可扩展特性，吸收了CES的一些优点，根据双语语料库的需要，定义了一个适用于标注双语语料库的文档类型定义。在提出对双语语料库的分类之前，本文介绍了文档自动分类算法研究的分类、文档表示模型和常用的分类算法。接着本文在现有算法的基础上提出了一个改进的文档自动分类算法。通过对当前算法的研究，针对双语语料库的特点进行了改进，并实现了该算法。实验结果证明，该算法能实现对单、双语文档进行分类，并且能在一定程度上提高分类的精确率和召回率。﹀
论文总页数：	53
参考文献总数：	0
馆藏号：	008/M2000(07)
公开日期：	2000-05-01

1999-06-01

非确定性函数类与 SAT 的结构性质.刘田

题名：	非确定性函数类与 SAT 的结构性质
姓名：	刘田
学号：	19608805
论文语种：	汉语
专业：	计算机软件与理论
培养层次：	博士
学位：	理学博士
培养单位：	北京大学
院系：	计算机科学技术系
专业：	计算机软件与理论
导师1姓名：	俞士汶
导师1单位：	北京大学计算机科学技术系
论文答辩日期：	1999-06-01
外文题名：	Nondeterministic classes of functions and structural properties of SAT
关键词：	NP 搜索问题的复杂性；非确定性函数类；SAT 的结构性质；并行搜索归约为判定；可挑选性
论文摘要：	︿本文内容包括两个部分，分别是关于非确定性函数类和SAT的结构性质．本质上本文研究非确定性多项式时间问题类NP的搜索问题的计算复杂性．上述两个部分的内容研究的分别是NP非完全集和NP完全集。为了研究各种NP非完全集搜索问题的复杂性，本文分别定义了对应于NP和NP的子类Fewp和UP的非确定性函数类，并且定义了求解各种NP搜索问题的子搜索问题的函数类, 本文研究了前一类函数类之间的包含关系和复合关系, 证明在这些类中有一个新类是Fewp的对应物．本文研究了后一类函数类之间的包含关系和归约关系．提出了关于其中一个类的一个猜想．这些结果对于理解各种NP非完全问题的搜索问题的复杂性是重要的，因为它们提供了比较和分类各种NP非完全集搜索问题的复杂性的框架．关于NP完全集搜索问题的复杂性，最重要的假设是‘SAT是多项式时间并行地搜索归约为判定’的假设．因为用现有的证明技术不可能绝对地解决这个假设，本文研究了这个假设与其他关于SAT结构性质的假设之间的关系，证明了如果‘NP有多项式时间图灵归约下的稀疏完全集’则‘SAT是多项式时间并行地搜索归约为判定’，以及如果仅设‘P不等于NP’，则要么‘SAT不是多项式时间并行地搜索归约为判定’，要么‘SAT不能用多项式时间真值表归约归约为有界可近似集’，并且证明假设‘P不等于NP’，则‘NP在多项式时间析取归约下没有稀疏完全集’．这些结果在各种关于SAT的难以解决的假设之间建立了联系，推进了对于SAT的结构性质和对于NP完全问题的搜索问题的复杂性的知识，是使用目前的证明技术所能够设想的最佳研究路线．这些结论对于所有NP完全集都是适用的．证明这些结论的关键引理是‘所有NP稀疏集是可以用多项式时间并行查询SAT所计算的函数来打印的．’该引理改进了过去关于稀疏NP集的可打印性结果．本文的第二部分还用集合类代替函数类研究了对可挑选性的扩充定义，证明了对于文献中已知的一些扩充定义，这两种方法给出相同结果．﹀
外文摘要：	︿ This thesis contains two parts, which concern nondeterministic classes of functions and structural properties of SAT respectively. Essentially, this thesis studies the computational complexity of search problems for the nondeterministic polynomial-time class NP. The two parts study NP non-complete sets and NP-complete sets respectively. To study the complexity of search problems for NP non-complete sets, nondeterministic classes of functions, which are analogues of NP and its subclasses FewP and UP, are defined in this thesis. Also defined are classes of functions which solve the sub-problems of NP search problems. The inclusion and composition relations between the first classes of functions are studied in this thesis. Among these classes there is a new class which is an analogue of FewP. The inclusion and reduction relations between the second classes of functions are studied. A conjecture is posed about one of these classes. These results are important for understanding the complexity of search problems for NP non-complete sets, because they provide a framework to compare and classify the complexity of these search problems. On the complexity of search problems for NP-complete sets, the most important hypothesis is that 'SAT is polynomial-time non-adaptively search reducible to decision'. Since it is impossible to settle this hypothesis absolutely using our current proof techniques, the relationships between this hypothesis and other hypotheses about structural properties of SAT are studied in this thesis.
It is proved that if 'sparse NP-complete sets under polynomial-time Turing reductions exist' then 'SAT is polynomial-time non-adaptively search reducible to decision', and that if 'P is not equal to NP' then either 'SAT is not polynomial-time non-adaptively search reducible to decision' or 'SAT is not polynomial-time truth-table reducible to bounded approximable sets', and that if 'P is not equal to NP' then 'sparse complete sets for NP under polynomial-time disjunctive reductions do not exist'. These results establish links between hypotheses about SAT which are hard to settle, which is the best line of research that can be hoped for under current proof techniques, and they improve our knowledge about the structural properties of SAT and about the complexity of search problems for NP-complete sets. The results are applicable to all NP-complete sets. The key lemma in the proofs of the above results is that 'all sparse NP sets are printable by functions computable in polynomial time with non-adaptive queries to SAT'. This lemma also improves previous results about the printability of sparse NP sets. Generalizations of selectivity using classes of sets instead of classes of functions are also studied in the second part of this thesis. It is proved that these two methods give rise to the same result for some generalizations of selectivity known in the literature. ﹀
论文总页数：	63
参考文献总数：	0
馆藏号：	008/D99(03)
公开日期：	1999-06-01

1999-05-01

汉英机器翻译中的基于实例的转换引擎研究.常宝宝

题名：	汉英机器翻译中的基于实例的转换引擎研究
姓名：	常宝宝
学号：	19508804
论文语种：	汉语
专业：	计算机软件与理论
培养层次：	博士
学位：	理学博士
培养单位：	北京大学
院系：	计算机科学技术系
专业：	计算机软件与理论
导师1姓名：	俞士汶
导师1单位：	北京大学计算机科学技术系
论文答辩日期：	1999-05-01
外文题名：	The research on example-based transfer engine in Chinese-English machine translation
关键词：	基于实例的机器翻译；机器翻译；计算语言学
论文摘要：	︿长期以来，国际国内机器翻译研究中使用的主流方法是基于规则的方法（RBM），90年代随着经验主义的复苏，以理性主义为哲学基础的基于规则的方法受到质疑和挑战。但是，相关研究也表明，目前经验主义方法同样也难于使得机器翻译技术获得突破性的进步。一般认为，经验主义和理性主义有各自的优点和缺点，基于不同方法部件的混合模型比基于单一方法的模型更能有效增加系统的处理能力。基于实例的机器翻译方法的基本思想是由日本京都大学的长尾真于1984年提出的。这种方法利用存储在实例库中的实例作为翻译的主要知识资源，采用类比的方式完成机器翻译。在基于实例的框架中，不需要显式规则，可以有效地避免开发汉英机器翻译系统时书写规则所带来的一些问题。然而，基于实例的机器翻译系统，要求有大规模的实例，系统的生成能力较弱，处理实例库中没有的语言现象的能力有限。基于以上认识，作者针对传统的转换技术中所存在的问题，重点探讨用基于实例的机器翻译技术对其进行改进的可能性。本文在以下几个方面做了一些探索性的工作。（1）在详细分析基于实例翻译技术的基础上，探索把这一技术用于机器翻译中的一个环节，即转换环节。提出了一个基于双转换引擎的汉英机器翻译模型，以传统转换引擎为主引擎，利用基于实例的技术对转换质量进行了改进和补充。（2）详尽分析了基于实例的翻译技术的生成能力问题，并在分析汉语句子间复杂匹配情况的基础上，提出以语法形式为单位进行实例之间的匹配。（3）在考察具有相同中心词的句子的译文规律的基础上，提出了一个关于翻译的中心词假设，并据此提出了一个基于中心词的实例匹配算法。（4）以《同义词词林》为基础，提出了一个覆盖评价机制。一个覆盖的质量可由组成覆盖的各个实例片段的内部相似度、外部相似度以及实例片段的大小加权度量。（5）提出了一个汉英双语实例的构建框架。对实例库中的汉语部分、英语部分进行了句法分析，并在此基础上，探讨了对小规模双语例句集进行词汇、短语一级的对齐问题。﹀
外文摘要：	︿ In the field of Machine Translation, rule-based approaches are still the overwhelming choice. But empiricist approaches, which revived in the early 1990s, are challenging the rule-based approaches. After several years of effort by the international Computational Linguistics community to build statistical models for Machine Translation, some limitations of empiricist approaches have become more and more apparent. It cannot be expected that empiricist approaches will bring Machine Translation a breakthrough either. Rationalism-based methods and empiricism-based methods each have their own strong points and shortcomings. More and more researchers believe that a hybrid of these approaches could effectively improve translation quality. Example-Based Machine Translation (EBMT) was proposed by Makoto Nagao of Kyoto University, Japan, in 1984. In EBMT the main knowledge resource is an aligned bilingual example set, and translation is carried out by the principle of analogy. In the framework of Example-Based Machine Translation there is no need to write explicit rules, which has been shown to be difficult in the paradigm of Rule-Based Machine Translation. However, EBMT needs a large-scale bilingual example database. Moreover, due to its limited generative capacity, EBMT cannot effectively process linguistic phenomena unknown to the example database. In this paper, an Example-based Transfer Engine is proposed, which is integrated into an existing rule-based Chinese-English Machine Translation system and is aimed at resolving problems arising from the traditional transfer technique.
The following subjects are discussed in this paper. (1) Based on a detailed analysis of EBMT techniques, a model based on double transfer engines is put forward, in which EBMT techniques are applied to the transfer stage of Chinese-English machine translation and the EBMT transfer engine serves as a supplementary component to the main traditional RBMT transfer engine. (2) The lack of generative power of EBMT is discussed in detail, and the complexity of matching between Chinese sentences is also shown. The Chinese syntactic form is proposed as the unit for matching between Chinese sentences. (3) A hypothesis that all sentences or phrases with similar headwords have similarly structured translations is put forward, and a matching algorithm, or cover-generating algorithm, based on this hypothesis is presented. (4) Based on the thesaurus TongYiCiCiLin, a cover-evaluating scheme is proposed. The quality of a cover can be evaluated by the internal similarity, external similarity and size of the example fragments involved in the cover. (5) A framework for constructing an example database is discussed, in which the examples are parsed and aligned at the word and phrase level. Alignment techniques for small-scale bilingual corpora are also discussed. ﹀
论文总页数：	90
参考文献总数：	0
馆藏号：	008/D99(02)
公开日期：	1999-05-01

1997-06-01

古诗研究的计算机支持系统和相关的计算语言学课题.沈钢

题名：	古诗研究的计算机支持系统和相关的计算语言学课题
姓名：	沈钢
学号：	19408006
论文语种：	汉语
专业：	计算机科学理论
培养层次：	硕士
学位：	硕士
培养单位：	北京大学
院系：	计算机系
专业：	计算机科学理论
导师1姓名：	俞士汶
导师1单位：	北京大学计算机系
论文答辩日期：	1997-06-01
关键词：	自然语言处理；古籍的计算机处理；文本联接；自动抽词；互信息
论文摘要：	︿文摘暂缺﹀
外文摘要：	︿文摘暂缺﹀
论文总页数：	32
参考文献总数：	0
馆藏号：	008/97<33>
公开日期：	1997-06-01

1996-01-01

“古诗研究的计算机支持环境”的设计与实现.刘岩斌

题名：	"古诗研究的计算机支持环境"的设计与实现
姓名：	刘岩斌
学号：	19308009
论文语种：	汉语
专业：	计算机科学理论
培养层次：	硕士
学位：	硕士
培养单位：	北京大学
院系：	计算机系
专业：	计算机科学理论
导师1姓名：	俞士汶
导师1单位：	北京大学计算机系
论文答辩日期：	1996-01-01
关键词：	自然语言处理；古籍；语料库；知识库；超文本；全文检索；属性检索；互信息
论文摘要：	︿文摘暂缺﹀
外文摘要：	︿文摘暂缺﹀
论文总页数：	46
参考文献总数：	0
馆藏号：	008/96<07>
公开日期：	1996-01-01

1995-06-01

面向机器翻译的汉语句法规则和自动分析.陶晓明

题名：	面向机器翻译的汉语句法规则和自动分析
姓名：	陶晓明
学号：	19208007
论文语种：	汉语
专业：	计算机理论
培养层次：	硕士
学位：	硕士
培养单位：	北京大学
院系：	计算机系
专业：	计算机理论
导师1姓名：	俞士汶
导师1单位：	北京大学计算机系
论文答辩日期：	1995-06-01
关键词：	汉语句法规则；汉语句法分析；机器翻译
论文摘要：	︿文摘暂缺﹀
外文摘要：	︿文摘暂缺﹀
论文总页数：	0
参考文献总数：	0
馆藏号：	008/95<05>
公开日期：	1995-06-01

1994-05-01

日汉机器翻译模型系统的实现和词汇功能语法的应用.刘东

题名：	日汉机器翻译模型系统的实现和词汇功能语法的应用
姓名：	刘东
学号：	19108006
论文语种：	汉语
专业：	计算机科学理论
培养层次：	硕士
学位：	硕士
培养单位：	北京大学
院系：	计算机系
专业：	计算机科学理论
导师1姓名：	俞士汶
导师1单位：	北京大学计算机系
论文答辩日期：	1994-05-01
关键词：	自然语言处理；机器翻译；中间语言
论文摘要：	︿文摘暂缺﹀
外文摘要：	︿文摘暂缺﹀
论文总页数：	0
参考文献总数：	0
馆藏号：	008/94<12>
公开日期：	1994-05-01

1993-06-01

现代汉语语料库多级处理与汉语短语结构分析.周强

题名：	现代汉语语料库多级处理与汉语短语结构分析
姓名：	周强
学号：	19008005
论文语种：	汉语
专业：	计算机科学理论
培养层次：	硕士
学位：	硕士
培养单位：	北京大学
院系：	计算机系
专业：	计算机科学理论
导师1姓名：	俞士汶
导师1单位：	北京大学计算机系
论文答辩日期：	1993-06-01
关键词：	现代汉语语料库；多级处理；汉语短语结构
论文摘要：	︿文摘暂缺﹀
外文摘要：	︿文摘暂缺﹀
论文总页数：	0
参考文献总数：	0
馆藏号：	008/93<05>
公开日期：	1993-06-01

1991-06-01

一个定点驱动的双向句法分析器.毛少伟

题名：	一个定点驱动的双向句法分析器
姓名：	毛少伟
学号：	18808009
论文语种：	汉语
专业：	计算机科学理论
培养层次：	硕士
学位：	硕士
培养单位：	北京大学
院系：	计算机系
专业：	计算机科学理论
导师1姓名：	俞士汶
导师1单位：	北京大学计算机系
论文答辩日期：	1991-06-01
关键词：	定点驱动；双向句法分析器
论文摘要：	︿文摘暂缺﹀
外文摘要：	︿文摘暂缺﹀
论文总页数：	0
参考文献总数：	0
馆藏号：	008/91<08>
公开日期：	1991-06-01

1990-06-01

日汉机器翻译的研究和模型系统的实现.陈华

题名：	日汉机器翻译的研究和模型系统的实现
姓名：	陈华
学号：	18708022
论文语种：	汉语
专业：	计算机应用
培养层次：	硕士
学位：	硕士
培养单位：	北京大学
院系：	计算机系
专业：	计算机应用
导师1姓名：	俞士汶
导师1单位：	北京大学计算机系
论文答辩日期：	1990-06-01
关键词：	日汉机器翻译模型系统的实现
论文摘要：	︿文摘暂缺﹀
外文摘要：	︿文摘暂缺﹀
论文总页数：	0
参考文献总数：	0
馆藏号：	008/90<33>
公开日期：	1990-06-01

2011-11-28

日语三字词的探究——以报纸语料为例.赵玲莉

题名：	日语三字词的探究——以报纸语料为例
姓名：	赵玲莉
学号：	10817505
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-11-28
外文题名：	The research of three-character phrase in news corpus
关键词：	三字词；三字词抽取；组合模式；日汉词汇对比
外文关键词：	Three-character phrase; extraction of three-character phrases; combination mode; phrase comparison
论文摘要：	︿三字词语是现代日语词汇系统中极为重要的一类,对于三字词语的研究与分析，有助于我们对现代日语词汇作更为全面细致的了解。本文对近百万字的报纸语料中出现的三字词进行了定量统计与定性分析。本研究使用基于规则的方法对三字词进行了自动抽取，利用基于统计的方法进行进一步删选，最终人工进行分析共抽取出1786个三字词。本人在这些研究中检验了前人发现的一些理论或规律，并在研究中提出了自己的思考，也进行了实际的验证。文章分析得出了新闻中三字词在构成要素、构成方式和词性上的特点，对报纸语料中出现的三字词的感情色彩和词族现象进行了分析，并对比分析了日汉二字词和不同时期日汉三字词的异同，通过梳理归纳其他人的三字词研究，对比分析自己的研究成果，对三字词历时的比较很好地勾勒出三字词发展变化轨迹。衷心希望本研究能为近代日语词汇的深入研究和日汉三字词的对比研究提供丰富的语料和理论参考。本论文的主要内容由以下七部分组成：第一、介绍日语三字词的产生及三字汉字词的研究现状、研究的内容及研究的意义。第二、介绍了本次研究三字词的界定及抽取。第三、分析新闻中三字词组合模式上的特点，对比日汉二字词及不同时期三字词的在构成要素和构成方式上的异同。第四、分析报纸语料中三字词的词性分布、感情色彩、词族现象及词性上的特点。第五、总结本研究的结论，并提出了对未来工作的展望。﹀
外文摘要：	︿ Three-character phrases are an important type of homograph in the Japanese vocabulary. Research on and analysis of these phrases gives insight into the Japanese vocabulary. This thesis carries out a quantitative and qualitative analysis of the three-character phrases in a newspaper corpus of nearly one million characters. The study first extracted three-character phrases automatically with a rule-based method, then filtered them further with a statistical method and manual analysis, finally obtaining 1786 three-character phrases. During this research, the author tested some of the theories and regularities found in previous studies, proposed her own views and verified them. The thesis analyzes and summarizes the characteristics of three-character phrases in terms of combination mode and part of speech, analyzes the emotional coloring and word-family phenomena of the three-character phrases appearing in the news corpus, and compares Japanese and Chinese two-character phrases as well as the similarities and differences of three-character phrases in different periods. By summarizing other studies of three-character phrases and comparing them with the author's own results, the diachronic comparison gives a good outline of how three-character phrases have developed. It is sincerely hoped that this study provides rich corpus material and a theoretical reference for in-depth research on the modern Japanese vocabulary and for the comparative study of Japanese and Chinese three-character phrases. The main contents of the thesis can be divided into the following seven parts. First, an introduction to the emergence of Japanese three-character phrases, the state of research on them, and the contents and significance of the study. Second, the definition and extraction of three-character phrases. Third, an analysis of the combination-mode features of three-character phrases in the news, comparing the similarities and differences of two-character phrases and of three-character phrases in different periods.
Fourth, an analysis of the part-of-speech distribution, emotional coloring, word-family phenomena and part-of-speech features of three-character phrases. Fifth, the summary and conclusions of this study, and prospects for future work. ﹀
分类号：	H087/TP311.5
论文总页数：	60
参考文献总数：	49
馆藏号：	017/M2011(684)
公开日期：	2011-11-28

手机行业基于用户行为的意图发现.刘先兵

题名：	手机行业基于用户行为的意图发现
姓名：	刘先兵
学号：	10817248
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	王厚峰
导师2单位：	信息科学技术学院
论文答辩日期：	2011-11-28
关键词：	用户意图分析；数据挖掘；行为分析；推荐系统
论文摘要：	︿当前信息科技、电子商务领域的快速发展给商家不但带来了直接接触买家的机会，而且也让买家能够足不出户就可以接触到很多的卖家。不过这虽然给买家带来了更多的选择机会，但也给他们带来了如何快速地挑选自己中意商品这一难题。本文要解决的问题就是如何让卖家快速地将买家可能感兴趣的商品摆在买家的面前，从而提高店铺的销量并提升买家的购物体验。本论文创新性地将用户的意图定义为用户希望购买的商品所具有的特征，并通过用户的行为来分析出购物意图。这样用户在进入一家店铺时，根据计算出来的用户意图与店铺各商品所具有的功能和特征的匹配程度计算出一个分值，然后挑选出分值较高的商品推荐给用户。这样不仅可以提升店铺的销售额，而且可以方便买家快速地挑选到自己满意的商品，促进了买卖双方的共赢。笔者依托于实习公司淘宝的海量数据以及Hadoop 分布式平台的计算能力，对手机行业的用户行为进行分析，采用Logistic Regression 模型来对用户行为到购买商品特征进行建模，以此来挖掘出用户行为和用户最终购买商品所具有特征之间的关系。同时笔者还利用Learning to Ranking 的模型对店铺内的商品和用户兴趣特征之间的关系建立出一个排序函数，并最终将用户最感兴趣的商品推荐给买家。最终笔者完成了模型的建立，并且通过此挖掘过程，证明了用户的行为能够很好地预测出用户购买商品的特征并预测出店铺内感兴趣的商品。而且工程上，完成了整个系统的开发，并且该系统性能上能够满足整个手机行业的推荐需求。﹀
分类号：	TP311.13/TP274
论文总页数：	57
参考文献总数：	22
馆藏号：	017/M2011(669)
公开日期：	2011-11-28

构建面向用户的软件本地化翻译质量评估体系.李晓英

题名：	构建面向用户的软件本地化翻译质量评估体系
姓名：	李晓英
学号：	10817218
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-11-28
关键词：	软件本地化；翻译质量评估；评估指标；指标权重；Delphi；模糊层次分析；用户体验
论文摘要：	︿质量是企业生存之本。在软件本地化业务范畴逐渐延伸的大环境下，为了能够在本地化行业有立足之地，各软件本地化企业都在乐此不疲地寻求能够提高本地化质量的“妙方”。统计发现，当前大多数语言服务企业都提倡“以客户为中心”的原则。大到企业质量管理策略，小到本地化生产中的质量控制和质量保证步骤，客户要求及其制定的评估标准都是本地化生产和质量保证的关键依据。客户需求固然重要，而软件用户的体验和感受也是不可忽视的一部分。软件用户是软件产品设计和开发的目标受众，他们对产品的感受和反应是产品性价比在市场上的“温度计”。同理，软件用户对软件产品本地化翻译的反应，也是评价软件本地化企业本地化能力的重要依据。基于此，本研究旨在提出并构建了一种面向用户的软件本地化翻译质量评估体系，旨在让软件用户充当“语言专家”的角色，对译文的准确性、文理通达性、完整性、规范性等方面进行评估，以此收集用户对软件本地化翻译质量的感受和评价，帮助软件本地化企业发现工作的不足，找到有效的提高本地化翻译质量的方法。该研究在文献参考和实际应用的基础上，拟定出适用于软件用户的评估指标；为了证明拟定指标的合理性和可行性，本研究采用Delphi专家调查法争取各方面专家的意见和建议，最终收敛评估指标各层面和准则；为了能够将该评估体系的评估结果量化，本研究采用基于模糊一致性矩阵的模糊层次分析法计算各评估指标的权重值；通过实际案例实践，本研究分析了该评估体系的可靠性和有效性；最后，本文对该评估体系的实际应用提出了相关建议，引导软件本地化企业更有效地利用该评估体系。此外，在回顾总结本研究主要内容的同时，笔者阐述了本研究尚且存在的不足，并给出下一步研究的方向。对于语言服务行业来说，本研究有非常重要的借鉴意义和使用价值。首先，本研究突出了软件用户的评估在软件本地化过程中作用和影响；其次，为软件用户评估软件本地化翻译质量提供了一个完整的、系统的框架；最后，为软件本地化企业在本地化翻译质量管理策略和方法上开辟了一个新的方向。﹀
分类号：	TP311.56/H085
论文总页数：	94
参考文献总数：	53
馆藏号：	017/M2011(666)
公开日期：	2011-11-28

视频台标检测技术的研究与实现.李铁

题名：	视频台标检测技术的研究与实现
姓名：	李铁
学号：	10917260
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-11-28
关键词：	视频台标检测；Haar特征；Adaboost算法；全局特征；SURF特征；SVM算法
论文摘要：	︿随着大量视频信息涌入人们的现实生活，视频台标检测作为对视频来源分析的一个有效手段。通过视频的台标，可以相对容易的确定视频的发布者，通过节目中的标识又能定位到具体的节目。通过这些重要语义信息，可以提供精确的视频搜索。此外，通过检测视频节目中的台标可以去除广告片段（国外很多电视节目广告片段中不含有台标），提高观赏性。同时，在视频安全领域，视频台标检测技术可以有效的确定视频来源，为过滤固定电视台的节目提供了自动监测手段。针对目前台标检测的实际需求，本文提出了两种方法对台标检测提出了分析和实现。第一方面的工作是对台标的定位，本文分别使用了背景差法和滑动窗口两种方法来实现对台标的定位。第二个方面，也是本文的主体，对台标的检测。首先，本文使用了经典的Haar特征作为基本特征，并且采用适应性强的Adaboost学习算法对以上特征进行训练，得到台标检测的模型。为了测试算法的性能，本文构建了一个专门用于台标检测的数据集，包含15种台标，约2500个互联网视频。在这个数据集上，本文测试了第一个算法获得了一个初始的查全率和查准率作为以后实验性能的基准值，并且分析了该算法的优势和存在的一些问题。针对第一个模型存在的问题，本文又提出了融合多特征的台标检测方法。基于目前计算机视觉领域常用的目标检测的视觉特征，融合多特征的台标检测方法提取了台标的全局特征（颜色特征、纹理特征和梯度特征）和局部特征（SURF特征），并通过SVM模型将这些特征融合，训练得到一个完整的台标检测模型。经过在台标数据集上的实验表明，该算法将台标检测的查准率提高到了95%以上，同时保证了65%以上的查全率。测试结果证明了本文所选取的特征是有效的并且融合多特征的台标检测方法能够满足台标检测的要求。﹀
分类号：	TN911
论文总页数：	66
参考文献总数：	35
馆藏号：	017/M2011(700)
公开日期：	2011-11-28

2011-11-24

激励制度、工作特性与本地化工程师离职倾向的关系研究——以H本地化公司为例.孙鹏亮

题名：	激励制度、工作特性与本地化工程师离职倾向的关系研究——以H本地化公司为例
姓名：	孙鹏亮
学号：	10817320
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-11-24
关键词：	本地化工程师；激励制度；工作特性；工作满意度；离职倾向
论文摘要：	︿目前多数本地化公司的工程师大多是学习或从事其他行业的人员，经过在本地化项目中逐渐积累经验，最终成为合格的本地化工程师。在这期间，企业和员工都付出了很大代价。目前本地化工程师离职率较高，给企业带来了较大不良影响。因此，如何有效避免工程师离职成为了本地化企业人力资源管理方面一个重要课题。本研究通过深度访谈与问卷调查相结合的方式，提出本地化工程师激励-工作特性-离职倾向关系理论模型，以H公司为例研究激励制度和工作特性对本地化工程师离职倾向的影响。结果显示本地化工程师对企业激励制度的满意度较低，其中对奖金最不满意；对本地化工程工作特性满意度一般，最不满意的是弹性工作制和自主创新性；对于激励制度和工作特性的满意度与其工作满意度显著正向相关，并经由工作满意度对离职倾向具有显著负向影响，内在满意度对离职倾向的预测能力更强。根据研究结果，本研究建议：建立合理的薪酬奖金制度，提高工程师工作热情；注意提高员工专业技能，完善培训和晋升机制，保证机制的公平性；设计适合的本地化项目的工作特性，保证工作的多样性和自主性；注重提高女性员工、未婚员工和高学历员工的工作满意度，降低离职倾向。对后续研究者的建议为：对更多的激励制度进行相关研究；影响离职倾向的因素有很多，可将本文未涉及到的变量纳入研究范围；个人属性不仅包括人口统计变量，还包括个人特质，建议将研究范围放宽等。﹀
分类号：	F272.92
论文总页数：	77
参考文献总数：	0
馆藏号：	017/M2011(671)
公开日期：	2011-11-24

2011-06-04

社交网络用户网页基于语义排序的关键词抽取技术.李春旭

题名：	社交网络用户网页基于语义排序的关键词抽取技术
姓名：	李春旭
学号：	10917238
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-04
关键词：	关键词抽取；社交网络；数据透视；数据挖掘；用户特征
论文摘要：	︿数据挖掘(Data Mining)是从存放在数据库，数据仓库或其他信息来源如互联网网页中的大量的数据中获取有效的、新颖的、潜在有用的、最终可理解的模式的非平凡过程。本论文介绍一种关键词抽取技术，它属于数据挖掘技术的一种，通过关键词的形式挖掘出一篇文档中最重要的特征，本文赋予了这种技术一个真实的情景，在文中这篇“文档”即是社交网络中某位用户的个人主页。本文从真实数据出发，介绍一种新颖有效的文档关键词抽取技术,自动挖掘出社交网络中用户的主要特征。在大量纯文本非结构化信息以及遵循xml标准网页半结构化信息中自动发现数据的维度特征在数据挖掘中具有相当难度，而且准确性难以保证。本文侧重于解决社交网络用户主页文本中关键词的抽取。这些关键词代表了用户从分享的文章，状态和日志中体现出的关注和喜好。本文通过一种有效的语义排序算法，对文本中的关键词以多维向量的形进行语义匹配，从文中找出可以代表文本主要内容的关键特征词，这些词可以做为用户特征，生成数据透视信息。本文的工程部分是将挖掘出的数据用数据透视工具呈现出来。数据透视可将数据分组，分解，过滤和比较，以一种更直观可操作的形式呈现给用户和研究人员。进而方便用户及研究人员对数据进行统计，筛选，发现数据之间的内在联系，获得研究成果。此外, 本网页关键词抽取技术也已经在微软中国广告技术中心网页广告组使用，并以本文作者为第一作者申报相关专利(英文版)。﹀
外文摘要：	︿ Data mining, a branch of computer science, is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Modern businesses see data mining as an increasingly important tool for transforming data into business intelligence that gives an informational advantage. This paper introduces a keyword extraction algorithm. The algorithm extracts from a document a list of keywords that represents its primary content. In this paper, the document is a user's portal page on a social network, and the algorithm automatically discovers the keyword features of social network users. Automatically extracting keyword features from large-scale unstructured text and semi-structured XML web pages is a challenge. In this paper, we use an efficient semantic ranking algorithm to rank the keyword candidates from a user's portal and take the top features of each user to show in a pivot table. The data in the pivot table can be used by users and researchers to explore the relationships among social network users easily and intuitively. The innovations of this paper come from two perspectives. From the algorithm perspective, a new keyword extraction algorithm based on semantic ranking is introduced. From the engineering perspective, a new pivot table tool is implemented for exploring the pivot data intuitively; the tool also has the flexibility to do filtering, comparison and ranking. The tool is built with WPF and Silverlight technologies. It gives researchers and tool users the capability to discover more valuable information in the sales, communications and health care fields. In addition, the program is used internally by the Behavior Targeting team at Microsoft Ad Platform China to analyze the behavior of users who click on ads. ﹀
分类号：	TP311.13/H087
论文总页数：	59
参考文献总数：	22
馆藏号：	017/M2011(338)
公开日期：	2011-06-04

2011-06-03

基于语言模型的汉英多词表达互译对自动提取研究.朱莎莎

题名：	基于语言模型的汉英多词表达互译对自动提取研究
姓名：	朱莎莎
学号：	10917610
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-03
外文题名：	Research on Automatic Extraction of Chinese English Translation Equivalents based on Target Language Model
关键词：	多词表达互译对；语言模型；对数似然；互信息；语义相似度
外文关键词：	multiword expressions; language model; loglikelihood; pointwise mutual information; semantic similarity
论文摘要：	︿多词表达互译对的自动提取研究是计算语言学领域的一个重要研究课题，其对机器翻译、人工翻译等领域有着重要的影响。本文以1991年到2010年中国政府的白皮书的中英文平行语料为基础，运行GIZA++工具生成候选的二元和三元汉英互译对。在此基础上，本文利用如下语言模型对候选汉英短语互译对进行自动识别研究：（1）采用对数似然以及互信息方法计算英语二元组和三元组模型本文利用对数似然以及互信息方法计算英语二元组和三元组中词语之间的关联度，依据给定的关联度阈值对候选二元汉英互译对和三元汉英互译对进行过滤，提高二元多词表达互译对和三元多词表达互译对的准确率。（2）利用依存句法分析工具获取依存关系对中心词本文通过Stanford大学的句法分析工具获取句子依存关系对，确定依存关系对中的中心词，为下一步基于相似度方法提取多词表达互译对提供支持。（3）基于相似度方法扩展多词表达本文提出基于语义相似度计算方法来扩大多词表达互译对提取范围。利用步骤（1）中提取出的二元组和三元组作为种子，根据依存关系对提取种子二元组和三元组中心词语以及候选二元和三元组中心词语，计算中心词语之间的相似度，取相似度值大于设定阈值的候选二元组和三元组，扩大其覆盖范围。（4）基于Web翻译提取汉英互译对虽然第（3）步能够扩大英语二元组和三元组的覆盖范围，但有些二元组和三元组没有对应的中文多词表达。为此，本文把扩展后的英语二元组和三元组作为翻译源，使用Google 翻译引擎生成中文译文，并根据编辑距离方法计算中文译文与标准答案中中文之间的距离，依据距离大小选择合格互译对。本文的研究方法不仅利用了GIZA++工具在提取候选汉英互译对准确率高的特点，而且采用语言模型的方法降低了汉英互译对的噪音。实验结果表明使用本文方法能够提高多词表达互译对的准确率和召回率。﹀
外文摘要：	︿ Automatic extraction of multiword expression (MWE) translation equivalents is an important task in computational linguistics. It is of great importance to machine translation (MT), human translation and other domains. Based on a Chinese-English parallel corpus of Chinese government white papers from 1991 to 2010, this paper runs the GIZA++ tool to extract candidate bigram and trigram translation equivalents, and carries out the following research on extracting MWE translation equivalents. (1) Use log-likelihood and pointwise mutual information to build the bigram/trigram model. This paper uses log-likelihood and pointwise mutual information to measure the association between the words in a bigram or trigram and then filters the candidate translation equivalents against a given association threshold, thus improving the accuracy of bigram/trigram translation equivalents. (2) Parse sentence dependency relations to get head words. This paper uses the Stanford dependency parser to generate dependency pairs and extract the head words in these pairs. These head words are used in the third step for similarity computation. (3) Extend MWE coverage based on a similarity approach. Building on the above experiments, this paper uses the multiword expressions extracted in (1) as seeds. The Stanford dependency parser is employed to get the head words of the seeds and those of the candidate multiword expressions. Afterwards, the similarities between the head words of the seeds and those of the candidates are calculated. In this way MWE coverage can be enlarged based on the similarities, which resolves the statistical tool's inability to find low-frequency MWEs. (4) Extract MWE translation equivalents based on Web translation. MWE coverage is extended in (3), but the corresponding Chinese translations are not provided in the extended set.
Hence, this paper takes the extended English MWEs as translation sources and sends them to the Google translation engine for Chinese translations. After getting these Chinese translations, this paper uses the edit distance to compare the Chinese translations returned by the Google translation engine with those in the answer set. To sum up, this paper makes use of GIZA++'s high precision in extracting candidate Chinese-English translation equivalents, and also uses the language model to reduce the noise in the candidate translation equivalents. The experimental results show that the method improves both precision and recall. Keywords: multiword expressions (MWE), language model, loglikelihood, pointwise mutual information, semantic similarity ﹀
分类号：	TP391/H059
论文总页数：	65
参考文献总数：	21
馆藏号：	017/M2011(399)
公开日期：	2011-06-03
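
Step (4) of the record above compares machine-translated Chinese output with the reference answers by edit distance. As an illustration only, here is a minimal sketch of the standard Levenshtein distance at character level; the abstract does not say which variant, unit or threshold the thesis uses, and the helper `closest_reference` is a hypothetical name introduced for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance between sequences a and b.

    Insertion, deletion and substitution each cost 1 (assumption: the
    thesis may weight operations differently).
    """
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, item_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # delete item_a
                current[j - 1] + 1,      # insert item_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def closest_reference(candidate, references):
    """Return the reference string with the smallest edit distance to candidate."""
    return min(references, key=lambda ref: edit_distance(candidate, ref))


if __name__ == "__main__":
    print(edit_distance("可持续发展", "可持续的发展"))
    print(closest_reference("人权事业", ["人权事业的发展", "人权事务"]))
```

A candidate pair would then be accepted when the distance to its closest reference falls below some cutoff; the cutoff value is not given in the abstract.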

中文网页褒贬评价系统的设计与实现.董洁

题名：	中文网页褒贬评价系统的设计与实现
姓名：	董洁
学号：	10817106
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-03
关键词：	情感倾向；卡方统计；朴素贝叶斯；语义分析
论文摘要：	︿随着网民对网络新闻认可度的提高，越来越多的人浏览网络新闻并进行相关评价。因此，网络新闻对企业，品牌，事件和人物的影响起着重大作用，甚至成为企业或品牌影响力扩张的基石。通过分析相关新闻报道，企业、组织或个人可以掌握大众的情感倾向，不断完善自身，进而更好的发展。本文以朴素贝叶斯方法为基础，提出了改进的朴素贝叶斯方法以及基于统计的语义分析方法用于情感分析，通过比较，发现基于统计的语义分析方法能够获得较好的效果。作者结合Hownet情感分析用词，从大量语料中提取常用的情感词汇及词汇组合扩充情感特征库，根据特征项的语义倾向构建情感分析模型。充分考虑网页的结构信息，综合考虑多种影响网页褒贬倾向的因素，对网页进行情感倾向性分析，提高褒义网页和贬义网页的准确率和召回率，在此基础上设计并实现一个基于搜索引擎对查询关键词的新闻网页进行褒贬评价的系统。本文采用的情感分析方法以及搜索引擎对查询词的整体褒贬评价方法，对其它情感分析系统具有一定的参考价值和实用价值。﹀
外文摘要：	︿ As network news gains acceptance, people increasingly comment on news reports found through search engines, and these comments play an important part in the influence of enterprises, products, events, and public figures. The comments can even become a cornerstone for their expansion. By analyzing the emotional tendency of news reports, enterprises, organizations, or individuals can grasp first-hand information and develop further. Based on the theory of Naïve Bayes, the author proposes an improved Naïve Bayes method and a statistics-based semantic analysis method to analyze the emotions in news reports. A comparison shows that the statistics-based semantic analysis achieves better results. By extracting commonly used emotional words and expressions from a sentiment corpus and combining them with the Hownet emotional words, the author constructs an emotional analysis model based on the semantic tendencies of feature items. Taking into account the structural information of web pages and the factors that influence the emotional tendency of news reports, the methods improve the accuracy and recall for both commendatory and derogatory texts. On this basis, the author designs and implements a system that evaluates the positive or negative tendency of the web news that readers find through keywords with the help of a search engine. The emotional analysis method and the overall keyword evaluation through search engines studied in this thesis have reference and practical value for other emotional analysis systems. ﹀
分类号：	TP311.56
论文总页数：	75
参考文献总数：	43
馆藏号：	017/M2011(061)
公开日期：	2011-06-03

翻译公司品牌建设模式研究.赵凰吕

题名：	翻译公司品牌建设模式研究
姓名：	赵凰吕
学号：	10917582
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-03
外文题名：	A Model Study of Brand Building for Translation Companies
关键词：	翻译公司；顾客满意；感知质量；品牌形象
外文关键词：	Translation Company; Customer Satisfaction; Brand Image
论文摘要：	︿翻译行业作为语言服务业的一个重要组成部分，近年来得到迅猛发展，但同时也面临着翻译市场的激烈竞争和较低的社会认同等困境，翻译公司要想获取竞争优势、实现突围，则必须全力打造自己的特色品牌。本文在吸收品牌学及翻译行业相关研究成果的基础上，试图从顾客满意的角度来探讨翻译公司的品牌建设策略，以文献研究和实证研究相结合的方法，构建了翻译公司品牌建设的基本模型，探讨了顾客满意度各种影响因素之间的关系及其对顾客满意度的作用机制。论文的研究工作主要包括：1.在文献研究的基础上，分析顾客满意的一般形成要素，从理论层面提出与其相关的翻译公司品牌建设影响因素的基本模型；2.本文还以中国对外翻译出版公司为案例加以说明和分析，为品牌建设研究模型的构建提供实证材料；3.以翻译公司顾客为具体调查对象，通过问卷调查和深入访谈调研方式来验证模型。研究结果表明：1.顾客感知质量各要素对翻译公司品牌形象都存在积极影响，而翻译公司品牌形象直接正面影响顾客满意度；2.不同顾客期望对顾客感知翻译质量有一定影响；3.整合顾客感知质量、翻译公司品牌形象以及顾客整体满意度的翻译公司品牌建设影响因素研究模型，对于企业来讲，最直接的意义在于能够重新认识提高顾客满意度的途径。﹀
外文摘要：	︿ The translation industry, as an important part of language services, has developed rapidly in recent years while facing intense competition and low social recognition. Translation companies should build their own brands to obtain competitive advantages. Drawing on previous research on the translation industry and on brand theories related to the customer satisfaction index (CSI), and combining theoretical analysis with empirical research, this thesis constructs a basic model of translation company brand building and probes into the relationships among different CSI factors and their influence on translation company customers. The work of this thesis mainly includes three aspects: (1) based on previous literature and academic analysis, this paper analyzes the common formation model of customer satisfaction and puts forward a corresponding factor model of translation company brand building; (2) this paper performs a survey among customers of Chinese Translation and Publishing Company; (3) interviews and questionnaires are used to test the model. The main conclusions of this research are as follows: (1) the factors of customer perception have a positive influence on translation company brand image, and translation company brand image in turn directly affects customer satisfaction; (2) differences in customers' expectations influence their perception of translation quality; (3) this thesis gives detailed suggestions on translation company brand building based on the model. ﹀
分类号：	H059/F713.50
论文总页数：	86
参考文献总数：	53
馆藏号：	017/M2011(395)
公开日期：	2011-06-03

翻译企业管理优化及商业智能研究.李坤

题名：	翻译企业管理优化及商业智能研究
姓名：	李坤
学号：	10917245
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-03
关键词：	翻译管理；翻译管理系统；商业智能；数据仓库；联机分析处理
论文摘要：	︿中国加入世贸组织为翻译行业带来了新的发展契机，近年来北京奥运会、上海世博会、广州亚运会等一连串世界顶级赛事、国际会议和活动更是在中国频繁地召开，这将翻译行业推向了发展高潮。然而，世界带给我们机遇的同时也带来了挑战。中国的翻译企业良莠不齐，虽然取得了一定程度的发展，但在国外顶级语言服务企业面前竞争力明显不足。在信息化时代背景下，翻译企业如何改进翻译管理并提升服务质量成为了该行业最重要的话题之一。本文首先介绍了国内翻译行业发展背景，在此背景下对两个翻译服务机构进行研究，总结了信息时代我国翻译企业面临的两类翻译管理问题：翻译资源管理与资源调配问题和业务历史数据分析问题。其次，本文以笔者所在翻译工作室的翻译管理实践为例，证明了使用翻译管理解决第一类问题的优越性；还对工作室使用的开源系统进行了改进，一方面提升了其数据分析能力，另一方面为国内翻译企业应用打下基础。再次，本文以国际翻译公司W为例，说明翻译企业在历史业务数据分析和支持业务决策方面，需要借助IT领域的商业智能解决办法。文章顺势介绍了商业智能核心技术：数据仓库、联机分析处理和维度建模。最后，本文针对W公司面临的第二类问题，提出了基于翻译管理系统的商业智能解决方案，使用MySQL、Kettle、Mondrian和OpenI等开源商业智能工具完成了相应的设计与实现。通过实际的查询分析过程，证明了使用商业智能解决第二类翻译管理问题的优势。﹀
分类号：	F270.7/H059
论文总页数：	110
参考文献总数：	0
馆藏号：	017/M2011(341)
公开日期：	2011-06-03

虚拟物品个性化推荐系统设计与实现.李波

题名：	虚拟物品个性化推荐系统设计与实现
姓名：	李波
学号：	10917234
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-03
关键词：	虚拟物品；个性化推荐系统；游戏可用性；博弈论
论文摘要：	︿本文讨论了虚拟物品购买动机理论、游戏性理论，并创造性地引入了博弈论，研究“基于规则的个性化虚拟物品”推荐问题并进行了验证。理论研究试图解决在游戏中提升平衡性、提高用户活跃度、避免信息过载的问题，通过在笔者参与并主持设计的一款名为“Gangues”的黑帮对战类社交游戏中，将理论研究与系统程序开发相结合，不但进行了模拟验证，还对游戏在大客户数的情况下的优化问题进行了讨论。笔者在太能沃可网络科技有限公司的实习工作中，带领5人团队开发了多个社交游戏，黑帮对战游戏“Gangues”正是其中一款。该游戏在巴西市场表现优异，两年来已经成为当地最流行的对战类社交游戏之一，游戏日活跃人数可达8万。多数研究都认为在电子商城中，一个好的商品推荐算法能够帮助电子商城获得更多的订单。但在我们开发的这类在线游戏虚拟社区中，虚拟物品的推荐算法研究则基本处于空缺状态。虽有学者将使用在电子商城的推荐算法移植到了虚拟世界/虚拟社区中，但忽视了游戏的本质是一系列规则的集合。和现实生活不同，虚拟世界的运行依赖于游戏开发者预定义的规则，玩家是规则的执行者。推荐系统必须依从虚拟社区的规则集合，并促进社区的繁荣，而非简单地增加“赢利”。基于对游戏本质的认识，同时借鉴了虚拟物品购买动机理论，即虚拟物品购买的主要三个动因：功能、快乐和社交因素，本文提出使用规则来进行虚拟物品个性化推荐，而研究的对象主要是那些功能性的虚拟对象，即虚拟物品中主要满足功能性需求的道具。通过研究游戏性理论，“基于规则的虚拟物品个性化推荐”的主要目标在于改善游戏动态平衡性，从而提高游戏活跃度和粘度。在如何进行推荐规则的制定方面，本文依托“博弈论”理论，帮助玩家从非完全信息博弈转化为“完全信息博弈”并做出最佳决策。这些理论为“基于规则的虚拟物品推荐”研究提供了理论支持。﹀
论文总页数：	65
参考文献总数：	0
馆藏号：	017/M2011(893)
公开日期：	2011-06-03

基于AHP的供应商评价模型及其应用研究.张望

题名：	基于AHP的供应商评价模型及其应用研究
姓名：	张望
学号：	10817481
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-03
关键词：	供应商评价；判断矩阵；特征值；特征向量
论文摘要：	︿自20世纪90年代以来，随着经济全球化的到来和市场竞争的加剧，供应商管理越来越受到理论界和企业界的重视。越来越多的企业开始加强与供应商之间的合作，尤其是与战略供应商保持密切合作关系。因此，如何选择适合自己企业发展需求的、有良好发展前景的供应商，成为企业发展一个至关重要的因素。如何选择最合适的供应商，采用何种方法来对候选供应商进行评价是本文主要讨论的问题。本文首先介绍了论文研究的背景及论文研究的现实意义，指出了供应商评价的目标，并对现有供应商评价方法及其优缺点展开了分析，同时指出自己选择供应商评价方法的理由。然后，作者简要阐述了实用决策的数学模型——层次分析法（Analytic Hierarchy Process，简称AHP），对它的基本概念和主要步骤进行了综述。为让层次分析法适用于供应商评价体系，且更加科学、合理的利用供应商所提供的定量数据，作者对该数学模型进行了改进，使其能够从定性和定量两个角度对供应商展开评价，同时，作者提出了一种全新的判断矩阵校正方法，当矩阵不满足一致性时，可以使用作者的方法进行校正。最后，通过系统设计A公司对油套管供应商进行评价的完整过程，作者证实了层次分析法的可行性和实用性，并将其与A公司传统的供应商评价方式进行了对比，证明了改进后的层次分析法在使用过程中所体现出来的优势。同时，作者也归纳了层次分析法在使用过程中出现的各种问题，并且针对每个问题，提出相应的改进方案，来完善供应商评价体系。﹀
分类号：	F259.23
论文总页数：	63
参考文献总数：	33
馆藏号：	017/M2011(258)
公开日期：	2011-06-03

2011-06-02

读者主体视角下国内技术文档写作与翻译研究.朱鹏飞

题名：	读者主体视角下国内技术文档写作与翻译研究
姓名：	朱鹏飞
学号：	10817530
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-02
关键词：	读者主体；技术文档翻译；共同对等物；翻译评价
论文摘要：	︿综观目前的翻译界，科技翻译是其中相当重要的一大块，“科学技术是第一生产力”，综合国力的比拼说到底是科技实力的比拼，而信息科技作为科技领域的领头羊，更是受到各国的额外重视。这就引出了本文讨论的领域——技术文档（Technical Documentation）翻译，技术文档翻译就其本质而言，其实是一种跨文化交际行为，其目标就在于给译入语读者传达信息，且译入语读者获得的信息应与源语言读者获得的一致，这就是理想的技术文档翻译。但是，技术文档翻译的译入语读者并不单单是被动的信息接受者，同时也是一个积极的参与者与互动者，他们会对翻译的文本做出反应。技术文档的写作目的，就是为产品使用者提供信息或帮助服务，让使用者更快捷、更舒心地使用产品，因此技术文档必须本着译入语读者为中心的原则，而不是以公司为中心。但是，读者是一个笼统的群体，从专业背景到使用经验等，都有所不同，因此他们的阅读习惯、兴趣与接受能力都不同，而目前国内技术文档的写作与翻译都恰恰忽视了这一点，没有对读者群体做很好的分析与分类，采取不同的写作与翻译手段。因此，一方面公司下大力气去写作、翻译技术文档，另一方面却又没人看，这一冷一热反映了目前技术文档界的尴尬。另外，虽然目前技术文档领域的写作与翻译流程已经非常完善，但是作者经过分析之后，认为仍缺乏非常重要的一环——翻译评价。本论文以奈达的读者反应论为评价主体视角，并通过调查问卷的形式，撷取当前技术文档使用现状的缩影，佐以理论，深入分析目前技术文档使用现状背后所隐藏的文化心态与理论成因，联系接受理论与传播学理论，指出其症结所在，并给出具体的翻译建议，帮助建设一个更加合理，更能满足读者需求的技术文档翻译规范，并通过分析统计结果，构建一个读者主体视角的技术文档翻译评价体系。技术文档的翻译不仅仅是一个字面的翻译，还应当深入调查用户体验，分析译入语用户的使用习惯等，真正做到本地化，而不是仅仅翻译字面意思。﹀
外文摘要：	︿ Science and technology translation is a very important part of the current translation industry. As the saying goes, "Science and technology are primary productive forces", and information technology, as a leader within the technology field, receives additional attention from all countries. This leads to the area discussed in this paper: technical translation. By its very nature, technical translation is a kind of cross-cultural communication. It aims to convey information to the readers of the target language, and the information that the target readers obtain should be identical to that obtained by the readers of the source language; this is ideal technical translation. However, the target readers are also active participants in the translation, who will respond to the translated text. The purpose of technical documentation is to provide information or assistance services for the users of the product, so technical documentation must be centered on the target readers rather than the company. This is consistent with the reader-response theory of Eugene Nida. However, readers are a broad group whose reading habits, interests, and receptivity differ, a fact that is currently ignored: reader groups are not analyzed or classified. In addition, although the process of technical writing and translation is already well developed, a thorough analysis shows that a very important part is still missing: translation criticism. In this thesis, the reader-response theory is used as the main theoretical support. Through a questionnaire, the author captured the current state of technical documentation use. Combining several related theories, the author also analyzed the underlying cultural psychology and causes, pointed out the sticking point, and gave some specific translation recommendations. By analyzing the statistical results, we can build a reader-oriented technical translation criticism system. Technical translation is not just literal translation; it should also take into account the cultural characteristics and needs of the target language and the user experience, including in-depth investigation and analysis of target-language users' habits, which contribute to true localization rather than mere literal translation. ﹀
分类号：	H085/TP334
论文总页数：	87
参考文献总数：	46
馆藏号：	017/M2011(281)
公开日期：	2011-06-02

项目协作式翻译教学网络辅助平台的研究与设计.徐庆

题名：	项目协作式翻译教学网络辅助平台的研究与设计
姓名：	徐庆
学号：	10817416
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-02
外文题名：	The Study and Design of Web-based Platform for PBL Translation Teaching
关键词：	网络翻译教学模式；社会建构主义；功能翻译理论；项目教学法；网络教学平台
外文关键词：	Web-based Translation Teaching Model; Social Constructivism; Functional Translation Theory; Project-based Learning; Web-based Teaching Platform
论文摘要：	︿网络外语教学在国内已经发展了十多年，但是网络技术在翻译教学领域的研究和应用情况并不十分理想。总体上，国内对网络翻译教学模式缺乏具体的流程规划，也缺少针对翻译教学特点设计的网络工具。因此，在国内，网络翻译教学领域存在较大的研究空间和价值。本文旨在探究一种适用于国内翻译（中英笔译）教学的网络辅助平台的设计与实现方案，即项目协作式翻译教学网络辅助平台，用以辅助课堂翻译教学。首先，结合相关文献，论述国内传统翻译教学模式的问题；并且通过实例分析和对比分析，论述国内外网络翻译教学模式及相关平台的现状、特点和不足，为模式改进和平台改进提供了针对点。接着，针对现有网络翻译教学模式的不足，结合社会建构主义翻译教学理论、功能翻译理论和项目教学法，提出了一种改进方案即项目协作式网络翻译教学模式，该模式支持项目式小组协作翻译和译句案例库自学两种优势互补的学习方式，并且将师生互动和协作贯穿于整个学习过程。然后，基于项目协作式网络翻译教学模式，对平台的功能性需求、非功能需求和技术需求进行了分析。最后，根据平台需求分析结论，对平台功能进行了详细设计和实现，主要包括小组项目协作学习和译句案例库自学两大功能模块，可以与课堂翻译教学进行整合应用。在教学应用中的初步尝试和反馈表明，平台能帮助师生在良好互动中进行小组翻译练习和译句案例库自学，有助于改善单一的传统课堂教学模式的问题，促进学生翻译能力的培养。期望本文在模式和平台方面的成果，能为将来的网络翻译教学研究提供借鉴和参考，在教学应用中发挥更大作用。﹀
外文摘要：	︿ Web-based foreign language teaching has been developing in China for over ten years, but not much has been done on the study or application of internet technology for translation teaching. The field of web-based translation teaching is therefore of great research value. In this paper, the author attempts to design a web-based teaching platform to assist domestic translation teaching. Firstly, the author reviews the literature on the flaws of the traditional domestic translation teaching model. Then, through case studies and comparative analysis, the author summarizes the flaws of the current typical web-based translation teaching models and related internet tools at home and abroad. Secondly, guided by social constructivism, functional translation theory, and project-based learning (PBL), the author proposes an improved model, the PBL web-based translation teaching model. The improved model supports both group collaborative translation practice and sentence-corpus-based self-learning, and it features close interaction and collaboration between teachers and students throughout the learning process. Thirdly, in light of the PBL web-based translation teaching model, the author thoroughly analyzes the desired web-based platform’s functional requirements, non-functional requirements, and technical requirements. Finally, on the basis of the requirements analysis, the author produces a detailed platform design and completes its technical development. Initial use and a survey of the platform show that it greatly facilitates interaction and collaboration in group translation practice and sentence-corpus-based self-learning, and helps develop students’ translation competence. ﹀
分类号：	G434
论文总页数：	104
参考文献总数：	47
馆藏号：	017/M2011(225)
公开日期：	2011-06-02

语言和流程角度的用户手册创作研究.孙晓东

题名：	语言和流程角度的用户手册创作研究
姓名：	孙晓东
学号：	10817322
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-02
关键词：	用户手册；质量；语言；流程；生命周期；成熟度
论文摘要：	︿质量问题长久以来困扰软件行业，而在二十一世纪，IT更是深入生活中的各个领域，市场竞争越来越激烈，企业要在不断发展中保持市场竞争力，就要应对新的要求：成本控制更严，项目周期更短，而最重要的还是要保证产品质量。用户手册作为对外发售产品的一部分，其质量问题也同等重要。根据相关国家标准和国际标准（如ISO标准等），技术文档细分有十多种，而用户手册（User Guide或User Manual）只是其中的一种。不同的技术文档在软件产品不同的生命周期中扮演不同的角色，在不同的阶段各种技术文档的重要性也发生转移：需求文档在软件开发的初期定义客户的需求；设计文档定义系统每个功能点的原理功能等；测试文档检测系统功能是否正常运转符合要求；而用户手册不同于前几种技术文档，有很多区别于其他技术文档的特点，如用户手册要作为产品的一部分直接发布，面向最终用户群体，负责帮助用户利用产品高效的完成某项任务；用户手册无需像设计文档和测试文档等介绍技术性原理等。本文专注于技术文档中的用户手册，从过程和语言两个角度来考察用户手册的质量。论文首先从语言的角度分析了用户手册的语言特点：用户手册语言受控性强，词的复杂度，可用词汇，句式都受到了严格的限制。然后文章从流程角度分析了用户手册开发过程的关键过程域，最后在CMM的基础上建立了用户手册的成熟度模型，确立了26个关键过程域。﹀
外文摘要：	︿ The software industry has long been encumbered by various quality issues. In the new century, the problem is worsened by intensifying competition and increasing quality demands. To survive the competition, organizations must deal with the emerging requirements: less spending, shorter project cycles, and ultimately higher quality. As part of the product released to the market, the user manual is also of great significance. According to national and international standards, technical documentation is subdivided into more than ten categories. SRS (Software Requirement Specification) documents define users’ requirements; design documents define the logic and function of each feature; test documents record test cases and running results. These documents are all restricted to internal use. User manuals are quite different: they are released to the market so that end users can finish a task efficiently with little or no involvement of the software provider, and they are spared from describing the fundamental and technical logic of the product, among other things. The thesis focuses on the quality of user manuals from two perspectives: process and language. It analyzes quality factors from a new angle by studying the manual's life cycle; it analyzes the influence of editors and maintenance on manual quality based on the author’s working experience; and it establishes a user manual maturity model based on a study of current maturity models and feedback from users and technical communicators. ﹀
分类号：	TP311/TP312
论文总页数：	78
参考文献总数：	0
馆藏号：	017/M2011(183)
公开日期：	2011-06-02

基于交际翻译理论的技术文档本地化.苏昊明

题名：	基于交际翻译理论的技术文档本地化
姓名：	苏昊明
学号：	10817314
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-02
关键词：	交际翻译理论；技术文档；本地化
论文摘要：	︿随着全球化的脚步加快，技术文档的本地化在跨文化交流中有着至关重要的作用，其翻译效率及质量直接影响着软件本地化的进程。本文写作背景是实习过程的项目经历，本文尝试提出符合中小软件项目技术文档本地化的流程，并在翻译过程中，使用纽马克的交际翻译理论进行指导，旨在使完成的英文文档符合英语国家科技文章的语言习惯，从而提高用户满意度。本文对于技术文档本地化进行了全面的研究，综述了本地化的相关概念和发展现状，指出当前中小软件项目技术文档本地化过程中面临的问题和挑战。本文要求译者在本地化过程中，既要尊重原文信息的真实性和准确性，又要给读者较多的重视，准确、流畅、清晰、简洁且符合目标语言表达习惯的译文才是译者应当重点关注的。本文优化了本地化流程，提出应当进行用户分析、翻译风格规范、用户反馈意见整理等，并在实际翻译过程中根据纽马克的翻译理论，从词汇、句法和篇章角度进行文体分析，根据翻译风格指南的要求进行翻译，引入计算机辅助翻译软件和质量控制软件从而提高翻译速度和翻译质量。最终，本文以实际项目为依托，解决本地化项目进行过程中涉及各个方面的问题，并通过问卷调查的方式对译文质量和客户满意度进行量化的分析，评价结果表明新型本地化流程和翻译风格指南能够提高技术文档项目质量。﹀
外文摘要：	︿ With the development of globalization, the localization of technical documents plays a significant role in cross-cultural communication. The efficiency and quality of technical documents have a great and immediate impact on the progress of software localization. This research draws on the author's internship experience in a Singapore company. The paper attempts to create a document localization process that meets the requirements of small and medium software projects. With the tenet of meeting customers' needs, the translation, which aims to produce a high-quality English version of the document, is carried out under the guidance of Newmark’s communicative translation theory. This paper presents a comprehensive study of technical document localization from various perspectives. After introducing the related concepts of localization and its current development at the international level, this paper points out the obstacles and challenges in technical document localization for small and medium software projects. The translator is required not only to respect the truth and accuracy of the information in the original text, but also to attach importance to the readers. Preciseness, clarity, conciseness, and conformity to the conventions of the target language are what the translator is required to strive for. The purpose of the research is to optimize the localization procedure with steps such as user analysis, a translation style guide, and user feedback collection. Under the guidance of communicative translation theory, this paper creates the style guide on the basis of stylistic analysis at the lexical, syntactic, and textual levels, and applies computer-aided translation tools and QA software to improve translation quality. In the end, the author applies the optimized procedure to a technical document localization project to solve the related problems, and analyzes the quality of the translation and user satisfaction with it through a questionnaire survey. The evaluation result indicates that the localization procedure and translation style guide have a positive effect on the quality of the technical document localization project. ﹀
分类号：	TP311.52/H085
论文总页数：	68
参考文献总数：	0
馆藏号：	017/M2011(179)
公开日期：	2011-06-02

运用翻译技术实施医学翻译教学改革.孟阳子

题名：	运用翻译技术实施医学翻译教学改革
姓名：	孟阳子
学号：	10817287
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-02
关键词：	翻译技术；医学翻译教学；案例教学；医学翻译案例；翻译质量；翻译效率
论文摘要：	︿随着我国医学事业的发展和医学领域对外交流频度的增加，医学翻译的需求量越来越大，而短时间内胜任医学翻译的译员人数难有大幅度的增加。面对医学翻译实践的这种日益严峻的挑战，译者应从翻译技术中寻求支持，从而能够高质高效地完成规模大难度高的医学翻译项目。目前，翻译技术在我国的医学翻译教学中尚未得到有效地推广和应用。传统的医学翻译教学内容侧重于医学翻译理论和医学翻译技巧，普遍采用以教师为中心的教学方法。这种传统的教学模式，无法有效地提高学生分析、解决实际翻译问题的能力。此外，在这种传统的教学模式培养下，学生利用翻译技术解决实际翻译问题的意识淡薄，以致翻译质量和翻译效率难以得到有效的提高。笔者借鉴了在商业、法律、医学领域普遍使用的“案例教学法”，并结合自身的医学翻译实践和翻译技术背景，提出了引入翻译技术，采用案例教学法，进行医学翻译案例教学，即“运用翻译技术实施医学翻译案例教学”，以提高学生医学翻译实践的质量和效率。笔者通过教学实验项目，以提高学生医学翻译实践质量和效率为目的，对两个班的学生实施了不同的教学。“普通班”的学生接受“翻译技术教学”；“实验班”的学生接受“运用翻译技术的医学翻译案例教学”。通过对比两个班的学生进行翻译实践的质量和效率，得出以下结论：运用翻译技术实施的医学翻译案例教学，可以更加有效地提高学生医学翻译实践的质量和效率。笔者在教学实践中采集和整理了部分案例，在第二章对这些医学翻译案例做了集中介绍和分析。其中的部分案例被用于教学实验项目。案例研究对充实案例库、对同行今后的教学实践提供了借鉴和参考。全文的研究分为四章。第一章，笔者首先分析了我国翻译技术教学的现状、翻译教学改革的情况以及医学翻译教学改革的必要性。然后，在分析回顾了国内外相关研究的基础上，笔者提出了本文需要解决的问题，“提高学生医学翻译实践的质量和效率”。最后，笔者介绍了本文的研究内容和研究方法。第二章，笔者首先对医学翻译实践和教学中普遍使用的翻译技术工具进行了研究。然后，笔者运用翻译技术工具对医学翻译案例进行了分析，总结了翻译技术工具解决医学翻译实践问题的优势和不足。第三章，笔者对教学实践的个案进行了分析，即“运用翻译技术实施的医学翻译案例教学”个案。笔者从教学设计、课堂实施、教学评价三个方面总结了教学个案的实施情况。第四章，笔者首先分析了“运用翻译技术的案例教学”与“独立的翻译技术教学”之间的差异。然后，笔者通过实验研究评价了“运用翻译技术的医学翻译案例教学”对提高学生医学翻译实践质量和效率的效果。通过对比两个班的学生从事医学翻译实践的质量和效率，笔者得出结论：运用翻译技术实施医学翻译案例教学，能够更好地提高学生医学翻译实践的质量和效率。最后，笔者对全文的研究工作做出了总结和展望。﹀
分类号：	H085/TP311
论文总页数：	93
参考文献总数：	0
馆藏号：	017/M2011(165)
公开日期：	2011-06-02

关联理论下用户手册翻译研究——以打印机用户手册翻译为例.徐尧

题名：	关联理论下用户手册翻译研究——以打印机用户手册翻译为例
姓名：	徐尧
学号：	10817418
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-02
关键词：	打印机用户手册；关联翻译理论；推理过程
论文摘要：	︿随着全球化进程的加速，国家间的距离也越来越小，但是语言却始终是一种障碍。打印机用户手册的翻译问题就是这种障碍的表现之一。导致翻译问题的原因包括译员本身翻译能力、翻译流程和翻译工具限制，但主要是因为缺乏合理的翻译理论为指导。本文通过问卷调查，筛选出用户接触最多，翻译质量相对较高的佳能打印机用户手册作为研究语料。基于对语料的分析，指出现行打印机用户手册翻译标准缺乏翻译理论的指导，存在缺陷，缺乏可操作性。接着本文将关联理论引入打印机用户手册翻译中。根据关联翻译理论和科技翻译机理，本文引入了全新的打印机用户手册翻译思路——合理的翻译过程可以看做专业知识、语境信息和语言知识三个因素之间的推理过程。通过对语料的分析，本文从专业知识、语言知识和语境信息三个方面总结了打印机用户手册的特点。以此为基础，分析打印机用户手册翻译中的三类推理过程：专业知识—语言知识推理、专业知识—语境信息推理和语境信息—语言知识推理。进一步得出两类推理过程可能包含的推理情形。针对每个推理情形，为打印机用户手册词语、句子的翻译选择合适的翻译方法和翻译技巧。最后通过实践论证，证明了新翻译思路可以有效的指导译者翻译打印机用户手册。﹀
分类号：	H059/TP391
论文总页数：	108
参考文献总数：	0
馆藏号：	017/M2011(226)
公开日期：	2011-06-02

软件本地化项目风险管理解决方案.杨丽霞

题名：	软件本地化项目风险管理解决方案
姓名：	杨丽霞
学号：	10817428
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-02
关键词：	本地化项目；项目管理；风险管理
论文摘要：	︿本地化行业兴起于二十世纪七十年代末，该行业以其自身特性决定了它是在不同地区，不同国界进行的跨地区跨国界专业生产活动。“本地化”的努力通常必须同公司的整体产品周期、预算和质量标准保持一致，方能获得预期的效果——而要达到这种平衡，就必须将本地化项目纳入公司的业务流程，来进行统筹计划，否则，类似项目所耗费的时间及成本可能会超过预期收入，从而得不偿失。因此，本地化项目的风险管理显得尤为重要。良好的风险管理非常有助于项目取得成功，否则项目失败的概率就会大大增加。本文根据风险管理的基础理论方法，结合本文作者在实习期间参与和负责的多个本地化项目实践经历，对本地化项目实施过程中的风险因素进行了分析，提出了本地化项目风险管理的解决方案。最后结合实际案例，详细论述了该解决方案在本地化项目中的应用，验证了风险管理所产生的价值。本文共分为六章。第一章介绍了本文的选题背景及意义、问题的提出过程、相关研究的现状以及本文的研究方法、内容、目标与组织结构。第二章综述了风险管理的相关理论，并讨论了相关理论的特点。第三章分析了本地化项目风险管理现状，并以实例证明在本地化项目中引入风险管理的必要性。第四章提出了本地化项目风险管理的一个解决方案。同时对Leavitt模型进行了改良，使用改良后的模型为基础详细分析了本地化项目实施过程中的风险因素。第五章介绍了X本地化项目及其实施过程，详细论述了风险管理方案在X本地化项目中的应用过程，并验证了该风险管理方案在本地化项目管理中的积极效果。第六章为总结与展望。本文的贡献主要体现在：第一，提出了一个针对本地化项目的风险管理解决方案。第二，对Leavitt模型进行了改良，并在风险识别阶段采用改良后的Leavitt 模型进行全面的风险因素分析，同时采用风险矩阵与Borda序值法进行风险分析，得出了影响本地化项目成功的关键风险因素。第三，结合实际案例进行了验证和应用，对提高本地化项目成功率，帮助企业在竞争激烈的市场环境下获得更大的利益都有一定的指导意义。﹀
分类号：	TP311.52/F224.5
论文总页数：	77
参考文献总数：	0
馆藏号：	017/M2011(232)
公开日期：	2011-06-02

计算机辅助翻译工具的测评框架与测评.高志军

题名：	计算机辅助翻译工具的测评框架与测评
姓名：	高志军
学号：	10917163
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2011-06-02
外文题名：	The Evaluation Framework and Evaluation of Computer-aided Translation Tools
关键词：	计算机辅助翻译；软件测评；计算机辅助翻译工具；测评框架
外文关键词：	Computer-aided Translation Software Testing and Evaluation Computer-aided Translation Evaluation Framework
论文摘要：	︿随着语言服务行业的不断迅猛发展，越来越多的翻译人员为了提高工作效率和翻译质量已开始考虑使用或已使用计算机辅助翻译工具，然而当前计算机辅助翻译工具种类繁多，且质量也是良莠不齐。为了帮助翻译人员选择合适的计算机辅助翻译工具，中国翻译协会本着为翻译行业服务的原则，成立了专门的科研项目来对当前计算机辅助翻译工具进行系统测评，并将该科研项目委托给北京大学语言信息工程系实施。研究小组将本次测评项目分为四个阶段:1)研究测评框架;2)构建计算机辅助翻译工具质量模型和用户需求模型;3)对计算机辅助翻译工具进行测评;4)依据测评结果向用户推荐工具。本文写作期间处于“研究测评框架”阶段，因此本文作者的主要目标则是提出一种可行的测评框架，并验证该框架的可行性。为了实现本文目标，本文所做主要工作如下：a.以软件质量模型及构建软件质量模型的方法为基础，提出了构建计算机辅助翻译工具质量模型的方法；b.分析了用户类型并参考Höge博士的翻译需求抽取理论，总结出了构建用户需求模型的方法；c.确定了本文中所使用度量的标度类型以及标度-值的映射函数，为软件评估做好了准备；d.挑出了线性加权模型作为计算软件质量权重总值的方法。上述工作即构建测评框架的过程。最后，为了验证框架的可行性，本文以北京大学计算机辅助翻译工作室的需求为实例，测试了应用该框架来挑选Trados、Wordfast和雅信这三款计算机辅助软件的结果。验证实验表明该框架的方法可行，结果准确。同时，实验过程中构建的计算机辅助翻译工具的质量模型和实验用例，则可作为今后类似研究的起点。﹀
外文摘要：	︿ With the rapid development of the language services industry, more and more translators are planning to adopt Computer-aided Translation tools (or have already done so) to improve their work efficiency and translation quality. However, as more and more Computer-aided Translation tools, ranging from very good to very bad, become available on the market, it becomes difficult for a translator to choose the most suitable tool. For this reason, the Translators Association of China, a society serving the translation industry, set up a special research project to evaluate currently available Computer-aided Translation tools, and entrusted this project to the Department of Language Information Engineering, Peking University. The research team divided the project into four phases: 1) study the framework for evaluation; 2) build the quality model of Computer-aided Translation tools and the user requirement model; 3) evaluate Computer-aided Translation tools; 4) recommend tools to translators in accordance with the evaluation results. At the time of writing, the project was still in phase one. The aim of the author is therefore to propose an evaluation framework and validate its feasibility. To fulfill this goal, the author mainly did the following work: a. put forward a method to build the quality model of Computer-aided Translation tools, based on the Software Quality Model and the methods used to build software quality models; b. analyzed the user profiles and summarized a method to build user requirement models with reference to the translation requirement elicitation theory of Doctor Höge; c. defined the metric scales used in the paper and the mapping functions between scale and value; d. picked the Linear Weighted Attribute Model to calculate the weighted total of software quality when executing the evaluation. Finally, the feasibility of the framework is validated with the PKU-CAT Studio as an instance. The framework is used to pick the most suitable tool for PKU CAT Studio among Trados, Wordfast and Yaxin. The validation experiment showed that the framework is feasible to implement and that the result is accurate. Meanwhile, the quality model of Computer-aided Translation tools and the test cases prepared during the experiments could be used as a starting point for similar research in the future. ﹀
分类号：	TP391.2
论文总页数：	115
参考文献总数：	39
馆藏号：	017/M2011(330)
公开日期：	2011-06-02

基于动词模式匹配的英语写作自动批改的研究与实现.靳光洒

题名：	基于动词模式匹配的英语写作自动批改的研究与实现
姓名：	靳光洒
学号：	10817183
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	俞士汶
导师2单位：	北京大学计算语言学研究所
论文答辩日期：	2011-06-02
关键词：	作文自动评价；动词模式；不自然表达
外文关键词：	Automatic Essay Grading; verb pattern; unnatural expression
论文摘要：	︿英语写作自动批改是教学信息化的重要内容，是教育现代化的必然发展趋势。写作自动评价研究具有较高的学术价值和商业价值。一方面，中国英语师资力量仍十分薄弱。课堂英语教学中，平均师生比例小于1：150。教师教学压力沉重。另一方面，课堂英语教学多由中国教师进行。由于教师也是二语学习者，因此难于识别学生写作中不地道、不自然的表达。然而，现有的英语作文自动批改系统多未充分考虑中国英语学生作为二语学习者的特点，不适合用于课堂教学中英语写作练习的评价。为满足二语教学中作文批改的需求，本文针对中国英语学习者的特点，进行了自动批改系统的设计与开发研究。认为不自然表达是影响中国英语学习者写作质量的重要因素。不自然表达是指不符合母语人士的表达和阅读习惯的短语和句子。是外语学习者受母语的影响，单纯依据语法规则生成的在外语中不存在或生僻的表达。基于模式语法，对不自然表达的产生原因、主要形式进行探究。并提出通过提取动词模式，对不自然表达进行识别。为验证方法的可行性，本研究设计并实现了一个原型系统，使用浅层语义分析技术提取动词模式。系统分为三个模块：标准动词模式库、学生作文动词模式提取模块、匹配模块。并建立测试集，对系统进行测试。通过与中国教师批改结果以及英语母语人士批改结果进行对比分析，发现使用动词模式匹配的方法能够有效识别中国英语学习者作文中的不自然表达，正确率为67%。自动识别的不自然表达数是中国教师人工识别数的3.55倍。﹀
外文摘要：	︿ Automatic Essay Grading (AEG) systems, as part of education IT solutions, are being increasingly adopted by educational institutions in China. This pressing need places high value on AEG research. On one hand, English teaching resources in China are limited. One English teacher teaches more than 150 students on average. Moreover, most English teachers in China are native Chinese speakers, who are second language learners of English themselves. It is difficult for them to recognize the unnatural expressions in students' essays. On the other hand, current essay grading systems are mainly designed for native English learners and are not suitable for grading essays written by Chinese learners of English. This study was conducted to meet the growing need for automatic grading of Chinese English learners' essays. It points out that unnatural expressions, which are phrases and sentences that sound strange to native speakers and are in most cases coined by Chinese learners, are a significant bottleneck for advanced second language learners. The causes and types of unnatural expressions are investigated based on pattern grammar, and it is noted that verb patterns can be used to find unnatural expressions. Semantic role labeling was employed in implementing the system, which was meant to determine whether verb pattern matching can recognize unnatural expressions effectively. The system encompasses a standard verb pattern library, a verb pattern recognition module, and a matching module. It can identify 67% of the unnatural expressions in the pilot corpus of essays by advanced Chinese English learners. The number of unnatural expressions recognized by the system is 2.55 times higher than the number identified by Chinese teachers. ﹀
分类号：	G434/TP37
论文总页数：	71
参考文献总数：	39
馆藏号：	017/M2011(112)
公开日期：	2011-06-02

2010-12-01

计算机辅助的本地化翻译质量检查系统研究与实现.丁矗

题名：	计算机辅助的本地化翻译质量检查系统研究与实现
姓名：	丁矗
学号：	10717100
论文语种：	chi
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	信息科学技术学院
导师2姓名：	俞敬松
导师2单位：	外国语学院
导师3姓名：	王继辉
论文答辩日期：	2010-12-01
外文题名：	Research and Implement of the computer-assisted translation quality-check tool
关键词：	计算机辅助翻译；翻译质量检查系统；本地化翻译质量检查
外文关键词：	computer-aided translation quality-check translation quality-check tool localization translation quality-check
论文摘要：	︿随着信息技术的发展，语言相关的技术也取得了很大进步，在本地化翻译行业中，由于其行业具有翻译量大，时间紧迫，技术应用性强等特点，所以普遍采用计算机辅助翻译（CAT）工具来进行本地化翻译工作，在这个过程中，本地化行业中的客户方、本地化语言服务提供商都面对着质量，时间和成本三方面的巨大压力，对翻译质量检查的自动化方案有着强烈的需求，计算机辅助翻译质量检查（CATQA）的概念应运而生，本文讨论了本地化翻译的特点，设计并实现了一种计算机辅助翻译质量检查系统，专门用于对本地化翻译质量进行检查。本系统包括三个模块：输入文件解析模块、翻译质量检查模块、错误报告输出模块。输入模块用于导入翻译之后的文件，对输入的文档格式进行解析；检查模块从格式，内容两大方面对所输入的文件进行翻译质量检查；输出模块用于生成格式化的翻译质量检查报告，反馈给用户。本文所涉及的课题是一项语言翻译学与计算机科学交叉的综合性研究，笔者对本地化翻译质量检查的自动化方法做了如下工作：1，针对传统纯人工方式校对所面临的诸多挑战，建立了一种面向本地化行业的计算机辅助翻译质量检查模型，使人和计算机达到协同分工；2，从计算机可计算的角度分析了常见的翻译质量检查项，比如漏译，翻译一致性，重复字，不完全翻译，内嵌代码检查，基于正则表达式的原文模式和译文模式比较，乱码检查，源语言错误检查，术语检查等，以国际化软件开发策略，设计并实现了一种支持多国语言的计算机辅助翻译质量检查系统（CATQA）。计算机辅助翻译质量检查系统（CATQA）的产生将提高译员、本地化服务提供方以及客户发包方在翻译质量检查方面的效率，帮助校对人员快速发现翻译相关的错误，并最终提高翻译的质量。﹀
外文摘要：	︿ With the rapid progress of information technology, language engineering technology has also made great progress. In the localization industry, owing to the great pressure from time, cost, and quality, stakeholders have a strong demand for effective solutions that can reduce costs, save time, and improve quality. Automated translation quality-check software has emerged in response. This thesis focuses on computer-assisted translation quality-check (CATQA) tools and, based on the author’s understanding of the characteristics of localization, discusses the design and implementation of a CATQA tool. The thesis is intended to bring together theories and practices from computational linguistics, general linguistics, and translation studies in discussing the following: 1. a model for CATQA and its application, and 2. the design and implementation of a prototype CATQA system for localization quality assurance. The computer-assisted translation quality-check tool is designed to check the source text and the corresponding translation and to report translation errors, saving human revisers time and effort by relieving them of the more mindless and mechanical aspects of their work. ﹀
分类号：	TP311.56/H085
论文总页数：	55
参考文献总数：	28
馆藏号：	017/M2010(655)
公开日期：	2013-12-01

基于语义图和半指导学习方法的关键词获取技术研究.李德聪

题名：	基于语义图和半指导学习方法的关键词获取技术研究
姓名：	李德聪
学号：	10817197
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	常宝宝
导师2单位：	信息科学技术学院
导师3姓名：	李素建
导师3单位：	信息科学技术学院
论文答辩日期：	2010-12-01
外文题名：	Research of Key Phrases Acquirement Based on Semantic Graph and Semi-supervised Learning
关键词：	关键词获取；语义图；半指导学习；隐含关键词；归纳式半指导学习
外文关键词：	Key Phrases Acquirement; Semantic Graph; Semi-supervised Learning; Latent Key Phrases; Inductive Semi-supervised Learning
论文摘要：	︿关键词是表达一篇文档的主要内容的单词或短语。对于信息检索、文本聚类等许多自然语言处理应用来说，关键词获取是一项重要的基础性工作。因此，本文围绕关键词获取技术进行研究，并提出基于语义图和半指导学习方法挖掘文档中的常规和隐含关键词。通常来说，一篇文档中的词汇并不孤立地表达文档的内容。为了捕获并更好的利用词汇之间的关系，我们在本文中提出基于维基百科中的信息把待获取关键词的文章抽象成一张超图形式的语义图。在该图中，词汇之间的二元关系和多元关系都被形式化的表达。基于如下被普遍接受的假设：标题通常能够出色地概括文档的核心内容，因而关键词倾向与标题存在紧密的语义关系，由此我们提出了两种计算词汇的重要性的方式，第一种方式应用基于超图的半指导学习，在该方式中，文档标题的影响通过语义图迭代地传播给其它词汇。另一种方式考查词汇与文档主题的关系以及它与其它词汇的相关程度，并把这两种因素综合起来。最终我们依据词汇的重要性选取关键词。这两种方式在我们的实验中都取得了较好的效果。其中，第一种方式效果更为突出。通常意义上的关键词来自文档本身，这些关键词称为常规关键词。然而，有些词汇从没有在文档中出现过，但它们仍然可能适合作为这篇文档的关键词，本文称之为隐含关键词。在获取常规关键词的基础上，本文提出了一种获取隐含关键词的方法，该方法以常规关键词获取过程中建立的语义图和语义图中词汇的重要性为基础，应用基于超图的归纳式半指导学习方法计算候选词的重要性，进而依据重要性选取隐含关键词。实验结果表明，尽管隐含关键词获取的质量逊于常规关键词的质量，但仍具有一定的可信度；并且所获取的隐含关键词总体上和原文的相关度是比较高的。﹀
外文摘要：	︿ Key phrases are phrases that express the main content of a document. For many natural language processing applications, such as information retrieval and text clustering, acquiring key phrases is a fundamental and important task. In this thesis we therefore study key phrase acquisition and propose to mine ordinary and latent key phrases based on a semantic graph and semi-supervised learning. Generally, phrases in a document are not independent in delivering the content of the document. To capture and make better use of their relationships in key phrase extraction, we suggest exploring Wikipedia knowledge to model a document as a semantic graph in the form of a hypergraph, in which both n-ary and binary relationships among phrases are formulated. Based on the commonly accepted assumption that the title of a document usually reflects its core content well, and that key phrases consequently tend to be semantically close to the title, we propose two approaches to calculating phrase importance. The former applies hypergraph-based semi-supervised learning, in which the influence of title phrases is propagated iteratively to the other phrases through the semantic graph. The latter examines the relevance of a phrase to the document's topic and its relatedness to the other phrases, then combines the two. Finally, we select key phrases according to phrase importance. Both approaches perform well in our experiments, and the first one works better. Most key phrases come from the document itself; these are called ordinary key phrases. There may be, however, phrases that do not appear in the document but are still appropriate as key phrases. We call them latent key phrases. 
On the basis of the extraction of ordinary key phrases, we propose a way to acquire latent ones, which relies on the semantic graph and the phrase importance produced when extracting ordinary key phrases, uses hypergraph-based inductive semi-supervised learning to compute candidate phrases' importance, and then selects latent key phrases according to that importance. Experimental results demonstrate that although the quality of the acquired latent key phrases is lower than that of ordinary ones, they are still reasonably reliable. In addition, these latent key phrases are, on the whole, highly related to the document. ﹀
分类号：	TP311.13/H146.3
论文总页数：	51
参考文献总数：	33
馆藏号：	017/M2010(680)
公开日期：	2010-12-01

搜索引擎查询建议系统的设计与实现.母亦翔

题名：	搜索引擎查询建议系统的设计与实现
姓名：	母亦翔
学号：	10817288
论文语种：	chi
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2010-12-01
关键词：	搜索引擎；查询建议；数据清理；数据挖掘；检索与排名
论文摘要：	︿信息检索中的所谓词语失配问题，是指用户的查询条件与文档词汇内容失配。此外，在Web检索服务中，用户可能更倾向于使用较短查询词，所以我们需要查询词建议方法作为辅助，以增强用户体验。查询建议技术通过交互的过程，让用户选择系统为其建议的扩展查询，以更好、更快地满足其信息需求。目前的主要商业搜索引擎均提供查询建议服务，由用户查询输入期间的查询建议和搜索结果展现后的查询建议两部分组成。本文作者在搜狗公司的实习过程中，参与设计并实现了一个在用户输入检索词时提供查询建议的系统，主要基于查询日志挖掘技术实现。当用户在搜索框进行查询输入的时候，系统可以预测用户的输入意图，给出最多10个查询词候选，并按点击偏好进行排序，以下拉框的形式展现给用户。产品人员提出了查询建议系统的五个工程目标：提高系统的数据质量；提高TOP位置点击率；提高系统的数据覆盖度；提升系统的用户体验；提高系统的时效性。作者与项目开发人员一起，从系统的设计与实现角度进行详细的研究分析，使最终的系统满足了这些目标。论文首先介绍了系统的整体设计方案：为达到时效性与数据覆盖度的平衡，我们设计了层次结构的查询建议数据库；在系统架构上，整个系统可以分为在线的前台部分和离线的后台部分，前台负责用户交互、更新数据、对查询建议进行检索排名，提升用户体验；后台负责数据制作，实现数据清理、数据挖掘、特征生成和索引制作功能，提升数据质量，并挖掘深层次的用户行为信息。论文详细论述了各个模块的功能、模块之间的协作关系以及整个系统的运营流程。在系统的实现部分，作者负责开发了三种关键技术：查询建议的获取与挖掘技术、查询建议的数据清理技术以及查询建议的检索与排名技术，论文详细地讨论了这些关键技术中采用的算法。作者与开发人员一起设计了查询建议系统的效果评估方案，评估数据质量与数据覆盖度、时效性以及排名合理性。论文给出了各种关键技术上线后，查询建议系统指标的变化情况。通过系统评估与指标分析，可以看出查询建议系统成功地实现了工程目标。﹀
分类号：	TP393.4
论文总页数：	67
参考文献总数：	0
馆藏号：	017/M2010(689)
公开日期：	2013-12-01

敏捷开发环境下英文用户文档开发研究.金坤

题名：	敏捷开发环境下英文用户文档开发研究
姓名：	金坤
学号：	10817181
论文语种：	chi
专业：	软件工程
公开时间：	1年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2010-12-01
外文题名：	Developing Quality English User Documents in an Agile Environment
关键词：	敏捷开发；用户文档；技术文档
外文关键词：	Agile Development; User Documentation; Technical Document
论文摘要：	︿2001年敏捷宣言发布，迄今为止已经接近十年。如今，敏捷开发模型已经在全球范围内成为主流的软件开发模式之一。敏捷开发宣言和原则对软件开发的各个方面都产生了深远影响，用户文档开发便是其中之一。凡是稍微了解敏捷开发的人，都会对敏捷宣言中“可以工作的软件胜过冗长的文档”记忆犹新。但是此“文档”指的是传统的开发过程中所需要的和生成的规模庞大的售前文档、需求文档、设计文档、开发文档等等，并不包括用户文档，如帮助文档（Help）、用户手册（User Manual）、版本发布声明（Release Notes）等等。然而，用户文档无论是对于公司、软件产品还是对于用户，都有着极其重要的作用。“个人和交互胜过过程和工具；可以工作的软件胜过冗长的文档；与客户协作胜过合同谈判；对变更及时做出反应胜过遵循计划”的敏捷开发宣言给文档开发和文档工程师在信息的获取和处理方面带来了巨大的机遇和挑战。由于敏捷模型持续开发和不断交付的特点，用户文档也需要持续开发和不断交付。本文研究了在敏捷开发环境下如何高质高效地进行英文用户文档开发，在理论研究和项目实践的基础上，指出敏捷开发模型对于用户文档开发的影响主要表现在文档工程师对于项目的沟通方式、参与模式和写作模式方面。简单说来，就是协作与写作。本文深入探讨了敏捷开发环境下文档工程师与敏捷开发团队如何沟通、如何协同工作，并且对敏捷开发环境下的两种文档开发模式即线性写作（Linear Writing）方式和基于主题的写作（Topic Based Writing）进行了对比和分析。本文指出，在团队协作方面，文档工程师应该使用Wiki作为主要沟通工具，主动参与敏捷开发项目，将文档开发过程融入到敏捷软件开发过程，并设计了开发团队和文档工程师合作流程图；在文档开发方面，应该采用面向主题的写作模式进行文档开发。﹀
外文摘要：	︿ It has been nearly ten years since the birth of the Agile Manifesto, and agile is now widely accepted as one of the major software development models. The manifesto and principles of agile software development have had a profound impact on every aspect of software development, and user documentation is no exception. Anyone who knows anything about agile software development is very likely to have “Working software over comprehensive documentation” in mind. However, the “documentation” here refers to project documents such as pre-sales documents, requirements documents, design documents and development documents, not to user documents such as Help, User Manuals and Release Notes. Moreover, user documentation is of great importance to companies, to software products and to users. The Agile Manifesto, “Individuals and interactions over processes and tools. Working software over comprehensive documentation. Customer collaboration over contract negotiation. Responding to change over following a plan.”, has changed the way user documents are developed. Because software products are developed continuously and delivered frequently, user documents also need to be developed continuously and delivered frequently. This thesis places user documentation in the agile environment and studies how to develop quality documentation in an agile project. It argues that, in an agile environment, an information developer should be a good agile team player and adopt a topic-based writing model to produce documents that are easy to use, easy to understand and easy to search. ﹀
分类号：	TP391
论文总页数：	56
参考文献总数：	26
馆藏号：	017/M2010(679)
公开日期：	2011-12-01

基于语料库的莎士比亚戏剧中译本译者风格研究.李小怡

题名：	基于语料库的莎士比亚戏剧中译本译者风格研究
姓名：	李小怡
学号：	10817217
论文语种：	chi
专业：	软件工程
公开时间：	3年后
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2010-12-01
关键词：	语料库；译者风格；莎士比亚戏剧；朱生豪；梁实秋
论文摘要：	︿译者风格研究是文学翻译研究的重要内容，朱生豪和梁实秋的翻译风格更是多年来学界的热门话题，相关著作可谓汗牛充栋。然而这些研究大多是基于研究者个人的主观印象和直觉判断，通过“范文详析”的方法，对译文进行定性分析。这样的译者风格研究虽对读者很有启发意义，但因其考察范围有限，很难对译者风格进行宏观和多层面的描写分析，而且这些研究成果也有待于通过定量研究的方法做进一步验证。本文即是出于这一目的，首先对传统定性研究的结果进行了总结，概括出朱生豪和梁实秋两位译者各自的翻译风格，然后，通过基于语料库的研究方法，对《终成眷属》（All’s Well That Ends Well）的梁实秋译本和朱生豪译本，从标准化类符/形符比值、平均句长、句末语气词、词汇密度和词频等五个方面进行定量分析，对二位译者的风格进行比对研究，并对传统的定性研究得出的结果予以验证和分析。具体如下：第一，在词汇层面上，统计和分析了两种译本的类符、形符及标准化类符/形符比。结果表明两者词汇丰富程度相当，只是梁实秋译本的词汇丰富程度稍高于朱生豪译本，说明梁实秋译本用词富于变化，即用词较活，趋于创造性，而朱生豪译本语言的使用重复稍多。关于两人词汇丰富程度的差异，传统定性研究得出的结论不太一致，本文通过标准化类符/形符比的统计分析验证了梁实秋的词汇稍显丰富这一结论。第二，在句子层面上，通过比较《终成眷属》的平均句长与其他莎剧原著的平均句长，发现此部戏剧是莎剧中较为特殊的一类，即句子结构比较复杂，句子也相对较长。从梁实秋译本和朱生豪译本的比对中可以看出，梁实秋译本的平均句长较长，更贴近原文风格，而朱生豪译本的平均句长较短，更易于目的语读者接受和理解。这与传统的定性研究得出的结论大致相符，即在句子层面上，梁实秋译本在句子结构上更贴近原文的风格，而朱生豪译本的句子结构与原文不尽相符，但较易理解。第三，在语篇层面上，通过统计和比较句末语气词发现，朱生豪译本使用了较多的对话标记，更为细致地再现了原作中的情感，而梁实秋使用的句末语气词较少。这与传统定性研究得出的结论也基本一致，即朱生豪的译文语气变化丰富，更接近原旨而使得文本更趋近舞台效果，而梁实秋的译文则比较枯燥乏味，语气变化不多，不像舞台上的对话。第四，与朱生豪译本相比，梁实秋译本的词汇密度较高，因此其信息量较大，翻译文本中漏译情况较少，更忠实于原文。这与传统定性研究得出的结果也基本相同，即朱生豪译本有较多漏译，梁实秋译本则更忠实于原文。通过以上研究，我们也可看到两个译本的相同之处，比如在高频词的使用上。两译本都使用了较多的人称代词，这与莎剧原著基本一致，很好地体现了戏剧台词还原真实、突出个性的特点。﹀
外文摘要：	︿ Translators' style is an important focus of literary translation studies, and many research papers have been published on the translation styles of Liang Shiqiu and Zhu Shenghao. However, most traditional research uses the “explication de texte” method, which analyzes versions by observing grammar, vocabulary, rhetorical devices and stylistic features in the translated texts. This traditional method is based on the researcher’s subjective impressions and intuitive judgments. Moreover, traditional analysis is limited in scope, which makes it difficult to capture a translator’s overall style. The results of qualitative research on translators' style therefore need to be re-examined by quantitative research. To that end, this thesis, based on Shakespeare’s All’s Well That Ends Well, first summarizes the traditional research results on the translation styles of Zhu Shenghao and Liang Shiqiu, and then attempts to characterize the two translators’ overall translation styles through corpus-based research. Finally, the traditional results are carefully examined and compared with the results of the corpus-based research. The results are as follows. First, from the lexical perspective, types, tokens and the standardized type/token ratio are calculated and analyzed; Liang is found to be more creative in his use of vocabulary, while Zhu is more conservative. On this aspect, traditional qualitative research has reached inconsistent judgments, and this corpus-based research supports the view that Liang’s choice of wording is slightly more creative. Second, the difference in mean sentence length between the two versions shows that, at the sentence level, Liang’s style is closer to the original, while Zhu’s version is easier for target readers to understand; this judgment is more or less the same as the previous qualitative research results. 
Third, from the passage perspective, explicitness is more vividly shown in Zhu’s translation. From the difference in the number of sentence-final particles, we conclude that Zhu is more successful in presenting the emotion of the original work, which also agrees with previous research. Fourth, lexical density is higher in Liang Shiqiu’s version than in Zhu Shenghao’s, from which it can be concluded that there are more omissions, deliberate or not, in Zhu’s version than in Liang’s. This finding is also in line with the previous qualitative research results. There are, needless to say, similarities between the two versions, for instance the high frequency of personal pronouns. ﹀
分类号：	H059
论文总页数：	57
参考文献总数：	0
馆藏号：	017/M2010(682)
公开日期：	2013-12-01

2010-06-02

面向多语言服务平台的术语管理研究.李昕玥

题名：	面向多语言服务平台的术语管理研究
姓名：	李昕玥
学号：	10817219
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	北京大学
导师2姓名：	何卫
导师2单位：	软件与微电子学院
论文答辩日期：	2010-06-02
关键词：	计算机科学技术领域；术语管理模型；术语知识；隐性知识；知识共享
论文摘要：	︿随着国际化的加速和科学技术的发展，科技领域的翻译需求越来越大，合格的译员供不应求。在这种大背景下，中国科学技术信息研究所开发了多语言服务平台，用以辅助人工翻译，提高翻译效率和质量。该平台集成了机器翻译系统和计算机辅助翻译系统，并实现了二者的优势互补。本文主要探讨多语言服务平台中的术语管理问题，尝试为多语言服务平台建立一个实用有效的术语管理模型。该模型以实现术语知识的共享为目标，较好地实现了翻译人员和平台的优势互补，从而保证了翻译的质量和效率。本文的研究工作主要有以下几点成果：1. 以微软技术文档的英汉翻译为实例，建立了一个面向多语言服务平台的术语管理模型。该模型充分考虑到翻译人员的工作习惯和计算机领域翻译的特殊需求，提高了人机协作的优化程度，增强了术语知识的循环积累和共享。2. 提出隐性术语知识的概念，从知识管理的角度提出了术语知识共享的模式和具体实施方法。3. 提出了平台术语库的建立方法，并在该方法的指导下完成了通用术语库和部分项目术语库的建立，为多语言服务平台准备了重要的术语知识资源。4. 文中提出的术语管理模型及实施方法对翻译公司或其他翻译团队有一定的借鉴意义。﹀
外文摘要：	︿ With the development of globalization and of science and technology, the demand for translation in scientific and technical fields has been increasing. The Institute of Scientific and Technical Information of China developed a multi-language service platform to assist human translation and improve the quality and efficiency of translation. The platform integrates machine translation systems and computer-aided translation systems. In this study, we focus on terminology management for the platform and build a practical and effective model for it; terminology management is a task that demands high quality in translation projects and is normally done with a high input of time and labor. The model leads translators and the platform to a better way of coordinating, thus ensuring the consistency and accuracy of terminology translation. The contributions of this study mainly include: 1. A model of terminology management for the multi-language service platform, leading to better coordination and the cyclical enrichment of terminological knowledge. 2. The concept of implicit terminological knowledge, and a method for sharing terminological knowledge. 3. A method for building terminology databases, and a large amount of terminological knowledge for the platform. 4. A terminology management model and method that can also serve as a reference for other translation organizations. ﹀
分类号：	G302
论文总页数：	71
参考文献总数：	28
馆藏号：	017/M2010(283)
公开日期：	2010-06-02

2010-05-28

基于用户行为的搜索引擎查询质量评估.孙宇

题名：	基于用户行为的搜索引擎查询质量评估
姓名：	孙宇
学号：	10717267
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2010-05-28
关键词：	搜索引擎；信息检索；数据挖掘；日志分析；用户行为分析
论文摘要：	︿在互联网上搜索用户需要的相关信息经常会成为一项费力且无功而返的任务，这种体验在日常生活中经常出现。对于用户来说，这会使得他们很不愉快，对当前使用的搜索引擎失去信心；而对于搜索引擎服务提供商来说，用户的不满直接造成流量的下降，而流量是提供商赖以生存的资源，这对于服务提供商来说是不可接受的，所以服务提供商需要了解用户在查询上面的满意度，如果大量的用户不满意就需要改进查询结果，就要采取个别结果调整，更重要的则是从整体的结构上面对系统进行调整，比如调整排序函数等方法。因此如何判断用户对查询的满意程度就成为了一个重要的问题。考虑到用户极少会直接告诉服务提供商对于某一结果不满意，所以必须通过采取某些方法来判断用户的满意情况。一般来说，判断一个检索系统的好坏可以通过准确率、召回率、MAP、NDCG[1]，通过人工对结果进行标注，但是这样做一方面会消耗大量的人力去人工标注结果，另一方面评价只能是整体的检索性能，而不能针对特定结果，而且这种评价方法更适用于科学研究，对于实际应用检索系统来说，采用这样的评价体系显然不合理。所以需要一种能够适用于商业应用，具有可靠性、健壮性，能够自动评价查询满意度，并且能够跟踪用户使用细节的查询质量评估系统。本文提出并实现了一种利用搜索引擎日志挖掘来提取用户行为并对查询质量进行评价的自动方法，可以针对特定的查询给出满意度得分，利用该得分来判断用户满意程度。该方法有如下一些优点：首先，对于用户来说是透明的，用户感觉不到满意度评价系统的存在，这样就可以避免干扰用户的正常使用。其次，该评价系统使用了用户的行为作为隐反馈信息[2]，利用改进朴素贝叶斯模型和决策树分类器，并通过大量的人工标注数据（搜索引擎服务提供商提供）进行训练，保证训练的可靠性。最后，通过使用著名时序数据库KDB，实现了快速的数据汇总统计，通过快速检索来跟踪给定用户的点击细节。经过实际验证，该方法可以自动对用户查询进行满意度评价，准确率达到了95.6%，对于搜索引擎整体质量评估和确定低质量查询有非常好的效果，同时，通过使用该方法提供的汇总功能，对确认用户需求和低质量检索结果出现原因有极大帮助。﹀
分类号：	G354.4/TP31
论文总页数：	73
参考文献总数：	25
馆藏号：	017/M2010(145)
公开日期：	2010-05-28

T级大规模检索系统和基于Hadoop的分布式检索系统研究.杨威

题名：	T级大规模检索系统和基于Hadoop的分布式检索系统研究
姓名：	杨威
学号：	10717344
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	屈婉玲
论文答辩日期：	2010-05-28
关键词：	信息检索；分布式检索；海量数据检索；Hadoop Streaming
论文摘要：	︿T级大规模检索系统和基于Hadoop的分布式检索系统研究项目主要分为两部分。第一部分为基础工程作业，完成单机模式下对TB级别数据进行存储、过滤、索引、检索、结果排序等操作的信息检索系统（参照成熟算法和类似的开源系统进行独立开发）。第二部分基于开源分布式系统框架Hadoop，对第一部分工程作业进行分布式化部署、实验和研究。其中第二部分涉及到新技术的引入、工程问题解决与开源系统的个性化开发等，是本项目的重点。分布式计算理论是检索业界最热门的方向之一，为未来信息检索行业指明了发展方向。本系统依托Hadoop这个开源的平台，在数据索引检索模块实现了面向纵向分布式计算的MapReduce模型。而在锚文本归并和网页质量分析方面，使用了横向的MapReduce模型。本项目对Hadoop Streaming的使用进行了深层次的探索，其中在Streaming任务的IO管理、side-files管理与收集、负载均衡方面进行了探索，提出了自己的见解。同时根据项目需求，尝试为Hadoop Streaming系统开发Inputformat与Outputformat处理模块并进行了实验。在第四章最后部分，提出了Hadoop Streaming目前存在的部分问题。项目完成了针对6亿英文网页数据的预处理、索引、检索实验（使用TREC09检索相关数据集），计算了BM25得分、PageRank得分、SiteRank得分、网页特征相关得分（色情词、压缩比、词长、标题长、锚文本），并在效率、性能和结果满意度上与基于Lucene的开源检索系统进行比较。结果显示本系统在信息检索评价指标MAP和P@10上要优于Lucene，并且在分布式模式下，随着工作节点增加，速度会得到明显提升并超越Lucene。﹀
分类号：	TP316.89/G252.7
论文总页数：	75
参考文献总数：	25
馆藏号：	017/M2010(183)
公开日期：	2010-05-28

2010-05-27

影视字幕翻译研究与项目管理.陈曌赟

题名：	影视字幕翻译研究与项目管理
姓名：	陈曌赟
学号：	10817095
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
论文答辩日期：	2010-05-27
关键词：	影视翻译；字幕；Chaume；两种翻译模式；对比研究
论文摘要：	︿影视翻译作为新兴的翻译研究领域，目前尚未引起国内翻译学界足够的重视与关注。相对于影视作品巨大的社会影响力而言，影视翻译的相关研究尤显不足。近年来我国大量引进外国的影视作品，影视翻译实践却缺乏系统的理论指导和统一的翻译规范。西方国家在影视翻译研究领域起步较早，在理论方面和实践方面都更为系统和成熟，因此，借鉴国外先进的字幕翻译理论及模式对规范我国字幕翻译工作具有积极意义。本文基于西班牙字幕翻译学家Chaume提出的两种翻译模式：“预翻译-字幕改编-时间轴制作”模式和“预翻译-时间轴制作-字幕改编”模式，[1]对两种模式进行了比较与选择，明确提出了倾向于后者的观点，并结合本文作者组织实施的Being Erica字幕翻译项目，对“预翻译-时间轴制作-字幕改编”模式的优越性进行了深入研究和有力论证。本文正文共分为五章。第一章阐述了选题的背景及意义，研究的内容、方法与目标以及 Being Erica的项目内容。第二章综述了西方与我国的影视翻译理论研究概况，分析了在理论研究方面我国与西方形成差距的原因，阐述了字幕翻译的相关概念。第三章对西班牙字幕翻译学家Chaume提出的两种字幕翻译模式进行了比较与选择，明确提出了作者的观点，分析了两种模式中的三个环节：预翻译、字幕改编和时间轴制作。第四章结合Being Erica项目，对两种翻译模式共有的预翻译环节进行了阐述，分析了计算机辅助翻译软件DejavuX和翻译功能目的论在本环节中的应用；重点针对两种模式的差异进行分析，归纳了两种字幕翻译模式各自的优缺点，通过对比，对“预翻译-时间轴制作-字幕改编”模式的优越性进行了有力的证明，最后简述了Being Erica项目的实施及收尾情况。第五章为总结与展望。﹀
分类号：	H059
论文总页数：	0
参考文献总数：	24
馆藏号：	017/M2010(261)
公开日期：	2010-05-27

高质高效的社交游戏本地化项目研究——以《开心厨房》为例.卢玉莹

题名：	高质高效的社交游戏本地化项目研究——以《开心厨房》为例
姓名：	卢玉莹
学号：	10817263
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	钱多秀
导师2单位：	北京航空航天大学外语学院
论文答辩日期：	2010-05-27
关键词：	本地化；本地化流程；社交游戏本地化；功能目的翻译理论
外文关键词：	Localization; Social Game L10n; L10n Procedures; Skopos Theory
论文摘要：	︿经济全球化(Economic Globalization) 是指世界经济活动超越国界，通过对外贸易、资本流动、技术转移、提供服务、相互依存、相互联系而形成的全球范围的有机经济整体[1]。经济全球化的最好体现，恐怕莫过于如今飞速发展、联络世界各个角落的互联网络经济。在网络产品开发者和营销者的眼里，经济没有国界，产品也要超越国界，于是，他们借助先进的技术和方便的网络，极力要将自己的产品推向世界大市场。然而，如果仅凭借先进的技术和方便的网络，他们往往还不足以获得成功，因为国界另一边的人们讲着不同的语言、有着不同的文化，于是，为了迎合潜在顾客的口味，产品推出者们不仅为自己的产品“换上了新包装”，有的甚至将产品内容做了调整，让自己的产品“讲”对方的语言并适应对方的文化。随着产品向国际市场推出的不断增多，一个专门致力于产品新包装与调适的行业——本地化行业诞生了。一般来讲，本地化对象往往包含软件、网页和媒体文档等。本文所研究的是一种网络产品的本地化——社交游戏的本地化。社交游戏又名插件游戏，英文全称为Social Game，是一种运行在SNS社区内，通过趣味性游戏方式增强人与人之间交流的互动网络软件[2]。根据业内人士预计，这种已然火热的游戏还有望在未来5年带来超过百亿美元的利润[3]。然而，由于社交游戏流行太快，也由于“时间差”会在很大程度上决定“成败”，游戏开发商往往是尚未将开发的产品进行本地化就把简单翻译过的版本迫切地推向了国际市场，所以专业的社交游戏本地化很少，这方面的研究更是几乎为零。鉴于此，本论文以一个社交游戏本地化案例为研究对象，一方面结合功能目的翻译理论探讨了实现高质量社交游戏翻译的策略与方法，另一方面结合项目管理原理探索了实现高效率社交游戏本地化的流程。﹀
外文摘要：	︿ Economic Globalization, an organic whole of the world economy with trade, capital, technology, service and communication surmounting national boundaries and cultural differences, finds one of its embodiments, the best one maybe, in the fast-developing and world-connecting internet economy. In the eyes of networking product producers and marketers, there exists no boundary but opportunities, so they set their minds on conquering the global market with the help of technology. However, time has come to tell them that great success takes more than technology, and that for catering to the need of customers with a different language and culture they had better give their products a localized look and feel. More and more of them have listened and taken action, hence the gradually booming localization industry. Generally speaking, localization refers to the localization of internet or computer products such as those of software and websites. And the localization this thesis is trying to deal with is called social game localization, which combines some of the features of software and website localizations. This social game industry is young, and has been growing fast and seems to be gaining momentum, and is expected by some insiders to bring about over 10 billion U.S. dollars in the near future, so to localize these games before launching them worldwide would be of great business value. However, because the industry is too young, and because most developers of these games feel reluctant to “waste” some precious time, not much practice of such has been done, not to mention academic researches in this aspect. Therefore, this thesis decides, based on a social game localization example, to bring about a set of social game localization procedures and do some study on the application of the famous functionalist skopos theory to the social game translation, in hopes of contributing to this industry and its research. ﹀
分类号：	TP312
论文总页数：	65
参考文献总数：	50
馆藏号：	017/M2010(295)
公开日期：	2010-05-27

受控语言在英语技术文档写作中的应用研究.赵晨

题名：	受控语言在英语技术文档写作中的应用研究
姓名：	赵晨
学号：	10817501
论文语种：	chi
专业：	软件工程
公开时间：	公开
培养层次：	硕士
学位：	工程硕士专业学位
培养单位：	北京大学
院系：	软件与微电子学院
导师1姓名：	俞敬松
导师1单位：	软件与微电子学院
导师2姓名：	王逢鑫
导师2单位：	外国语学院
论文答辩日期：	2010-05-27
关键词：	受控语言；技术写作；英语技术文档；文本分析；可读性
论文摘要：	︿技术文档写作在企业国际化的过程中扮演着举足轻重的角色。语言是写作的基石。技术文档的写作语言对文档的可读性和功能性都有着巨大的影响。为了提高技术文档的写作质量，英文技术文档应当采用一定的语言规则进行规范化写作。为此，受控语言被开发了出来。受控语言是普通语言的子集，通过限制词汇和句法规则达到规范写作的目的。本文主要围绕笔者在汤姆逊宽带研发公司实习期间所参与的受控语言英语技术文档写作项目展开。首先，在分析汤姆逊现有英语技术文档的基础上，指出英语技术文档写作中存在的一些问题。接着，通过分析业内已有的四种受控语言，选择受控语言规则进行文本改写实验，并对规则的适用性进行分析，最终选择适合的受控语言规则进行新产品的英语技术文档写作。然后，通过研究应用受控语言写作英语技术文档的实际项目案例，探讨受控语言英文技术文档写作项目的实施与执行。并结合受控语言规则在该项目中的应用实例，分析受控语言在解决汤姆逊文档问题上的优势。最后，本文对项目所产出的文本进行了分析，探讨受控语言的使用对英语技术文档的写作质量和写作过程的影响，为想要引进受控语言的企业文档部门提供了可供参考的经验数据。数据表明，应用受控语言写作的英语技术文档与普通英语技术文档相比可读性有明显提高，其中受控语言规则在指示型文本中的应用效果更明显。此外，文档写作时间会因受控语言的使用有所延长，但因为后期编辑工程量的减少，整个写作项目所用时间并不会受到影响。﹀
分类号：	H152.3/TP311.52
论文总页数：	58
参考文献总数：	23
馆藏号：	017/M2010(351)
公开日期：	2010-05-27

CAT历年论文

Table of Contents

2019-05-30

基于深度学习的自动句法纠错研究.黄浩洋

2019-05-29

基于自然语言处理的学生英文检错规则抽取研究.杨越

基于深度学习的视频行为识别研究.常志勇

辅助写作的语料库查询系统设计与实现.胡盖蕾

基于文献的中医经方靶点预测关键技术研究.张琢

基于网络表示学习的科技简报自动生成关键技术研究.张越

基于文本分析与计算的科技政策扩散关键技术研究.张丽颖

基于蒙特卡罗算法的皮肤病诊疗路径关键技术研究.张瑾

面向领域的先进技术侦测关键技术研究.张茜

基于层次条件变分自编码器的政府公文自动生成系统的设计与实现.邓雅妮

一种英语写作知识点推荐策略.Tianfang Gao

富信息古籍整理平台的设计与研究.刘晓娟

公文辅助阅读平台的设计与实现.何寒松

多功能古籍协同研究平台的研究与设计.邓娟

2019-05-27

大学英语写作学习平台游戏化设计研究与实践.戴欣怡

中文文本分析量化指标体系的研究与应用.杨雨萌

医学英语词典的研究与设计.尹梦佳

多维度智能英语词汇学习知识库研究.屠少辉

法律英语词汇学习系统研究与设计.包珍

基于思考帽理论的合作探究教学设计与实证.陈钗平

自适应英语写作系统社交模块的设计与实践.陈陟

面向考试应用的托福积极词汇学习微信小程序的设计.黄郭钰慧

出版审校流程中专业审校与目标读者审校的对比研究——以《培养小极客》为例.张心彧

京剧回译中的文化还原策略——以《伶界大王：1870-1937年京剧再造时期的演员与公众》为例.汪楚楠

翻译中的原型效应转移策略探究——以《推和敲》为例.杨舒涵

针对英语词汇石化问题的自适应词块系统研究与设计.王丽君

海外汉学著作精准回译策略研究——以《中国武术：从古代到21世纪》为例.钱康

基于语料库方法研究G.K.切斯特顿的反犹问题.窦蕾

2019-05-24

英文汉学著作的汉译：回译和变译.房一品

《译者的取与舍——简析英译汉的异化归化策略》.江皓如

2019-05-23

汉语“V-的”结构中的“的”及其锚定功能.叶永青

2019-05-20

供应链金融下中小企业信用评级研究——以工程机械行业为例.孙浩

国际视角下建筑行业协会合作对建筑职业培训效果影响的研究.田志伟

2018-11-30

中国技术写作认证考试设计与实证.阮羽

医学英语词汇学习系统研究与设计.荣岩

基于多模态理论和图式理论的雅思听说学习系统的研究与设计.周璇

基于模拟方法的技术写作同源开发教学研究.杨爱萍

2018-06-06

指称理论对于生成语法的必要性.张振宝

2018-05-27

英汉翻译中的变通与忠实.张英杰

2018-05-26

基于深度学习的文本语句扩展系统的设计与实现.于昌和

基于多人在线战术竞技游戏的虚拟团队数据分析与研究.曾伊蕾

基于神经网络的影视剧向量表示模型.隋春宁

面向移动端的用户检索实体抽取系统设计与实现.曹圣明

基于笔画的中文字向量模型设计与研究.赵浩新

英语智能写作个性化辅助系统的设计与实现.赵恩辉

基于深度学习的英文手写识别的设计与实现.王文杰

基于机器学习的作文分析系统设计与实现.李海涛

基于深度学习的英语语法纠错系统的设计与实现.陈宏业

基于深度学习的英语口语发音评测系统的设计与实现.吴琼

面向英语智能学习的知识库系统的设计与实现.梁彪

基于深度学习的实体关系抽取的研究.唐弘毅

数据驱动的海洋意识评价指标体系的构建与实证研究.王一博

基于深度神经网络的弱监督人脸识别方法研究.于程程

基于paraphrase generation的英语作文辅导功能的后端设计和实现.万泽宇

面向教育类视频的摘要生成技术研究与实现.帅远华

面向专业领域的自动综述关键技术研究.涂梦

2018-05-25

面向显隐式语法教学的学习材料加工和教学优化研究.林凤怡

基于支架式理论的技术文档写作教学研究.闫晓宁

中式英语的自动检测研究与应用.于婵

Keystroke logging 评估的技术写作和术语教学研究.钟梦俐

服务于中小学教师的在线研修系统的设计与实现.李贺

翻转课堂教学的游戏化设计和实证研究.吴丹

基于语块和数据分析的高中英语写作一体化的教学研究.迟蕊沂

面向读写一体化的英语写作系统的研究与设计.刘玥杉

官话方言翻译黑人英语的策略研究——以《绝非虚构：我的人生教训》为例.徐靖凯

英文汉学著作的汉译：回译和变译.房一品

科幻虚构词的偏离手段及翻译策略——以《神秘博士：耀眼的黑暗》为例.宋雅雯

最佳教学实践指引下的英语词汇学习系统前端设计与实现.徐冉