[Purpose/Significance] By investigating and analyzing foreign government Web archiving projects, this paper explores a practical path for government Web archiving in China. [Method/Process] Seven government Web archiving projects in four foreign countries were selected as cases. Based on the Web archiving lifecycle model (WALCM), an integrated project analysis framework of "organization-resource-risk management-utilization service" was constructed, and the cases were analyzed under this framework. [Result/Conclusion] Four implications are put forward: adopting a multi-agent collaborative organization mode, determining collection content and frequency on the basis of resource evaluation results, establishing a risk management mechanism suited to the project's own attributes, and creating smart services for government Web archiving projects.
Wang Ping, Zhou Xia, Li Yining, Huang Xinping, Chen Weidong. Comparison and Reference of Foreign Government Web Archiving Projects[J]. Library and Information Service, 2022, 66(17): 15-24.
DOI: 10.13266/j.issn.0252-3116.2022.17.002