#!/usr/bin/env ruby
########################################################################
# aozora_prepare.rb: Prepare for Aozora Bunko Text Processing
#
# Description:
# This script processes text files from Aozora Bunko for readability and
# standardization. It converts the encoding, removes annotations, replaces
# full-width spaces, and standardizes newlines.
#
# Author: id774 (More info: http://id774.net)
# Source Code: https://github.com/id774/scripts
# License: The GPL version 3, or LGPL version 3 (Dual License).
# Contact: idnanashi@gmail.com
#
# Usage:
# aozora_prepare.rb [input file] [output file]
#
# Requirements:
# - Ruby Version: 2.0 or later
#
# Version History:
# v1.3 2025-07-01
# Standardized termination behavior for consistent script execution.
# v1.2 2025-06-23
# Unified usage output to display full script header and support common help/version options.
# v1.1 2023-12-06
# Refactored for improved readability and documentation.
# v1.0 2014-01-22
# Initial release.
#
########################################################################
def usage
  script = File.expand_path(__FILE__)
  in_header = false
  File.foreach(script) do |line|
    if line.strip.start_with?('#' * 10)
      in_header = !in_header
      next
    end
    puts line.sub(/^# ?/, '') if in_header && line.strip.start_with?('#')
  end
  exit 0
end

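class Aozora
  def initialize(args)
    @infile = args.shift || "in.txt"
    @outfile = args.shift || "out.txt"
  end

  def run
    # Read the input as Windows-31J (Shift_JIS) and transcode it to UTF-8.
    File.open(@infile, "r:Windows-31J:UTF-8") do |source|
      File.open(@outfile, "w") do |data|
        content = source.read
        # Remove ruby (furigana) annotations enclosed in 《...》.
        content.gsub!(/《[^》]+》/, "")
        # Replace full-width spaces (U+3000) with ASCII spaces.
        content.gsub!(/\u3000/, " ")
        # Normalize CRLF line endings to LF.
        data.print content.gsub(/\r\n/, "\n")
      end
    end
    return 0
  end
end
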
if __FILE__ == $0
  if ARGV.length == 2
    exit(Aozora.new(ARGV).run)
  else
    usage
  end
end