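How can I batch-check EPUB files for corruption? I want to do it with a program and am looking for ideas
I'm not sure whether epubcheck is meant for this. It doesn't look like it: the description says it checks whether a file conforms to the spec. All I need to know is whether a file is damaged and whether it can be opened; spec conformance doesn't matter to me.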
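The kinds of damage I have in mind:
1. The download failed and left only a 4 KB junk file (padded to a file-alignment boundary).
2. The download stopped halfway, so the file is incomplete.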
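Both cases can probably be caught with a cheap pre-check that does not need epubcheck at all, since an EPUB is a ZIP container whose `mimetype` entry reads `application/epub+zip` and which carries a `META-INF/container.xml`. The following is only a minimal sketch under that assumption; the function name `looks_intact` is made up for illustration, and passing this check does not prove that every reader can open the book.

```python
import zipfile
import zlib


def looks_intact(path):
    """Return True if the file opens as a ZIP and has the basic EPUB entries.

    Hypothetical helper: a rough "can it be opened" test, not a spec check.
    """
    # A truncated download usually loses the ZIP directory stored at the end
    # of the file, and a zero-filled junk file has no ZIP signature at all.
    if not zipfile.is_zipfile(path):
        return False
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() reads every member and returns the first name whose
            # CRC does not match, or None when all members are fine.
            if zf.testzip() is not None:
                return False
            names = set(zf.namelist())
            if "mimetype" not in names or "META-INF/container.xml" not in names:
                return False
            return zf.read("mimetype").strip() == b"application/epub+zip"
    except (zipfile.BadZipFile, zlib.error, OSError):
        return False
```

For a batch run, something like `[p for p in pathlib.Path(".").glob("*.epub") if not looks_intact(p)]` would list the suspicious files in the current directory.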
I just tried epubcheck on two files.
Damaged file:
```
Messages: 1 fatal / 0 errors / 0 warnings / 0 infos
```
Good file:
```
… (500 lines omitted here)
ERROR(RSC-005): 507778787564457984.epub/text/part0272.html(10,63): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0273.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0273.html(10,63): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0274.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0274.html(10,63): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0275.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0275.html(10,63): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0276.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0276.html(10,63): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0277.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0277.html(10,63): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0278.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0278.html(10,63): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0279.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0279.html(10,63): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0280.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0280.html(10,63): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0281_split_000.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0281_split_000.html(10,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0281_split_000.html(10,136): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0281_split_000.html(9,70): Error while parsing file: Duplicate "8BVE20-a62d2f2e31ed4ca88c23f95b0c6356e7"
ERROR(RSC-005): 507778787564457984.epub/text/part0281_split_000.html(10,70): Error while parsing file: Duplicate "8BVE20-a62d2f2e31ed4ca88c23f95b0c6356e7"
ERROR(RSC-005): 507778787564457984.epub/text/part0281_split_001.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0281_split_002.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0281_split_003.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0281_split_004.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0281_split_005.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0282_split_000.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0282_split_000.html(10,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0282_split_000.html(10,136): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0282_split_000.html(9,70): Error while parsing file: Duplicate "8CTUK0-a62d2f2e31ed4ca88c23f95b0c6356e7"
ERROR(RSC-005): 507778787564457984.epub/text/part0282_split_000.html(10,70): Error while parsing file: Duplicate "8CTUK0-a62d2f2e31ed4ca88c23f95b0c6356e7"
ERROR(RSC-005): 507778787564457984.epub/text/part0282_split_001.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0282_split_002.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0282_split_003.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0282_split_004.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0282_split_005.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0283.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0283.html(10,63): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0284_split_000.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0284_split_000.html(10,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0284_split_000.html(10,136): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0284_split_000.html(9,70): Error while parsing file: Duplicate "8EQVO0-a62d2f2e31ed4ca88c23f95b0c6356e7"
ERROR(RSC-005): 507778787564457984.epub/text/part0284_split_000.html(10,70): Error while parsing file: Duplicate "8EQVO0-a62d2f2e31ed4ca88c23f95b0c6356e7"
ERROR(RSC-005): 507778787564457984.epub/text/part0284_split_001.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0284_split_002.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0284_split_003.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0284_split_004.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0285.html(9,70): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
ERROR(RSC-005): 507778787564457984.epub/text/part0285.html(10,63): Error while parsing file: value of attribute "id" is invalid; must be an XML name without colons
Check finished with errors
Messages: 0 fatals / 545 errors / 0 warnings / 0 infos
```
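That seems to solve it: apparently I only need to check whether a fatal error was reported. In the runs above, the damaged file produced 1 fatal, while the file that opens fine produced 545 errors but 0 fatals.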
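A rough sketch of how that rule might be automated, assuming the summary line always has the `Messages: N fatal(s) / ...` shape seen in the two runs above. The helper names and the jar location `epubcheck.jar` are assumptions made for illustration, and other epubcheck versions may format the summary differently.

```python
import re
import subprocess

# Matches e.g. "Messages: 1 fatal / 0 errors ..." and "Messages: 0 fatals / 545 errors ..."
SUMMARY = re.compile(r"Messages:\s*(\d+)\s+fatals?\s*/\s*(\d+)\s+errors?")


def fatal_count(output):
    """Return the number of fatals in epubcheck's summary, or None if absent."""
    match = SUMMARY.search(output)
    return int(match.group(1)) if match else None


def is_damaged(output):
    """Treat a file as damaged when at least one fatal is reported.

    A missing summary line is also treated as damaged, to stay on the safe side.
    """
    count = fatal_count(output)
    return count is None or count > 0


def run_epubcheck(epub_path, jar_path="epubcheck.jar"):
    """Run epubcheck on one file and return its combined console output.

    jar_path is a placeholder; point it at wherever the jar actually lives.
    """
    result = subprocess.run(
        ["java", "-jar", jar_path, str(epub_path)],
        capture_output=True,
        text=True,
    )
    return result.stdout + result.stderr
```

With this, `is_damaged(run_epubcheck("book.epub"))` would flag the damaged file from the first run and pass the file from the second run, spec errors notwithstanding.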
epubcheck: https://github.com/w3c/epubcheck